Methods for catching and analyzing JavaScript errors
- 2020-03-30 02:26:43
- OfStack
How do I catch and analyze JavaScript errors
Front-end engineers know that JavaScript has basic exception handling. We can throw new Error() ourselves, and the browser throws an exception when a call to one of its APIs fails. Yet most front-end engineers probably never think about collecting this exception information. After all, if a JavaScript error does not reproduce after a refresh, the user can refresh to work around it, the browser does not crash, and we can pretend it never happened. That assumption held before single-page apps became popular. A single-page app that has been running for a while carries complicated state, and the user may have entered a lot of input to get there. Should all of that work be redone after a refresh? So it is still necessary to capture and analyze exception information, and then fix the code so that the user experience is not affected.
Ways to catch exceptions
An exception we throw ourselves with throw new Error() can be caught whenever we want, because we know exactly where the throw is. Exceptions raised by browser API calls are not always as easy to catch. Some APIs are specified by the standard to throw exceptions, while others throw only in individual browsers because of implementation differences or defects. The former we can catch with try-catch; the latter we have to catch by listening for global exceptions.
try-catch
If a browser API is known to throw exceptions, we need to put the call in a try-catch so that an error does not leave the whole program in an invalid state. window.localStorage, for example, throws an exception when a write exceeds the storage quota, and it also throws on writes in Safari's private browsing mode.
try {
  localStorage.setItem('date', Date.now());
} catch (error) {
  reportError(error);
}
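The same pattern can be wrapped in a small helper. The following is only a minimal sketch: safeSetItem and its injected storage and report arguments are hypothetical names, not part of any browser API. In a page you would pass window.localStorage and the reportError function; here a stand-in storage object simulates an exhausted quota.

```javascript
// Hypothetical helper: writes a value and reports instead of throwing.
// `storage` is anything with a setItem(key, value) method,
// for example window.localStorage.
function safeSetItem(storage, key, value, report) {
  try {
    storage.setItem(key, String(value));
    return true;
  } catch (error) {
    report(error); // for example, the quota was exceeded
    return false;
  }
}

// Stand-in for a storage object whose quota is already exhausted.
var fullStorage = {
  setItem: function () {
    throw new Error('QuotaExceededError');
  }
};

var reported = [];
var ok = safeSetItem(fullStorage, 'date', Date.now(), function (error) {
  reported.push(error.message);
});
```

The caller gets a boolean back and can decide whether to fall back to in-memory state, while the error itself still reaches the reporting function.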
Another common try-catch scenario is callbacks. Because a callback's code is outside our control, we do not know its quality or whether it calls other APIs that throw. To keep a failing callback from breaking the code that runs after it, the call to the callback has to be wrapped in a try-catch.
listeners.forEach(function(listener) {
  try {
    listener();
  } catch (error) {
    reportError(error);
  }
});
window.onerror
Exceptions that no try-catch covers can only be caught through window.onerror.
window.onerror = function(errorMessage, scriptURI, lineNumber) {
  reportError({
    message: errorMessage,
    script: scriptURI,
    line: lineNumber
  });
};
Be careful not to get clever and listen for the error event through window.addEventListener or window.attachEvent. Many browsers implement only window.onerror, or implement only window.onerror in a standard way. Since the draft standard also defines window.onerror, we should simply use window.onerror.
Property loss
Suppose we have a reportError function that collects caught exceptions and sends them to the server in batches, where they are stored for querying and analysis. What information should it collect? The useful items are the error type (name), error message (message), script file address (script), line number (line), column number (column), and stack trace (stack). If an exception is caught by try-catch, all of this is available on the Error object (in most major browsers), so reportError can collect it. If it is caught through window.onerror, however, the handler receives only three parameters, so everything outside those three parameters is lost.
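A reportError function along these lines could be sketched as follows. This is an illustration under assumptions, not the article's implementation: createReporter, the injected send transport (in a page, perhaps an XMLHttpRequest POST), and the batchSize parameter are all hypothetical.

```javascript
// Hypothetical factory for the reportError function described above.
// `send` is an assumed transport; `batchSize` is an assumed tuning parameter.
function createReporter(send, batchSize) {
  var queue = [];

  function reportError(error) {
    // Copy only the attributes we care about; missing ones stay undefined.
    queue.push({
      name: error.name,
      message: error.message,
      script: error.script,
      line: error.line,
      column: error.column,
      stack: error.stack
    });
    if (queue.length >= batchSize) {
      reportError.flush();
    }
  }

  // Sends everything collected so far as one batch.
  reportError.flush = function () {
    if (queue.length > 0) {
      send(queue.splice(0, queue.length));
    }
  };

  return reportError;
}

var batches = [];
var reportError = createReporter(function (batch) {
  batches.push(batch);
}, 2);
reportError(new TypeError('first'));
reportError({ message: 'second', script: 'http://cdn.com/step1.js', line: 3 });
```

Because the function accepts both real Error objects and plain objects, the same reporter can serve try-catch blocks and window.onerror handlers.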
Serialized message
If we create the Error object ourselves, we control error.message. Whatever we put in error.message is essentially what arrives as the first parameter of window.onerror. (Browsers do make minor changes, such as adding the prefix 'Uncaught Error: '.) So we can serialize the attributes we care about (for example with JSON.stringify), store the result in error.message, and then read it back in window.onerror and deserialize it. Of course, this only works for Error objects we create ourselves.
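The round trip might look like the sketch below. The helper names createDetailedError and parseErrorMessage are hypothetical, and the prefix handling assumes the browser only prepends text before the JSON, as with 'Uncaught Error: '.

```javascript
// Hypothetical helpers; the names are illustrative only.
function createDetailedError(details) {
  // Store the attributes we care about as JSON in error.message.
  return new Error(JSON.stringify(details));
}

function parseErrorMessage(message) {
  // Browsers may prepend text such as 'Uncaught Error: ',
  // so start parsing at the first '{'.
  var start = message.indexOf('{');
  if (start === -1) {
    return { message: message };
  }
  try {
    return JSON.parse(message.slice(start));
  } catch (e) {
    return { message: message }; // not one of our serialized errors
  }
}

var error = createDetailedError({ message: 'save failed', userAction: 'submit' });
// Simulate what window.onerror would receive as its first parameter.
var details = parseErrorMessage('Uncaught Error: ' + error.message);
```

Messages that were not produced by createDetailedError fall through unchanged, so the same parser can run on every message that window.onerror receives.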
Fifth parameter
Browser vendors are aware of the limitations of window.onerror too, so they began adding parameters to it. Having a line number without a column number seemed asymmetric, so IE was the first to add the column number, as the fourth parameter. What people wanted most, though, was the full stack, so Firefox proposed putting the stack in the fifth parameter. Chrome argued that it would be better to put the whole Error object in the fifth parameter, so that any attribute, including custom ones, can be read from it. Chrome moved fastest and shipped the new window.onerror signature in Chrome 30, and the draft standard ended up being written the same way.
window.onerror = function(
  errorMessage,
  scriptURI,
  lineNumber,
  columnNumber,
  error
) {
  if (error) {
    // Newer browsers pass the whole Error object as the fifth parameter.
    reportError(error);
  } else {
    reportError({
      message: errorMessage,
      script: scriptURI,
      line: lineNumber,
      column: columnNumber
    });
  }
};
Attribute normalization
The attribute names used so far (script, line, column) are the uniform names we chose for reporting; the actual attribute names on Error objects vary from browser to browser. For example, Firefox exposes the script file address as fileName. We therefore need a dedicated function that normalizes the Error object, that is, maps the different attribute names onto the uniform ones. See this article for a concrete approach. Browser implementations will change over time, but maintaining such a mapping table by hand is not too difficult.
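A mapping table of this kind could be sketched as follows. The table contents are an assumption to be checked against the browsers you support: fileName, lineNumber and columnNumber are Firefox's non-standard attributes, and sourceURL is, to my knowledge, Safari's. The names ATTRIBUTE_ALIASES and normalizeError are hypothetical.

```javascript
// Assumed mapping table: uniform name -> browser-specific candidates.
// Extend it as browser implementations change.
var ATTRIBUTE_ALIASES = {
  script: ['script', 'fileName', 'sourceURL'],
  line: ['line', 'lineNumber'],
  column: ['column', 'columnNumber']
};

function normalizeError(error) {
  var normalized = {
    name: error.name,
    message: error.message,
    stack: error.stack
  };
  Object.keys(ATTRIBUTE_ALIASES).forEach(function (uniformName) {
    ATTRIBUTE_ALIASES[uniformName].some(function (alias) {
      if (error[alias] !== undefined) {
        normalized[uniformName] = error[alias];
        return true; // stop at the first alias that is present
      }
      return false;
    });
  });
  return normalized;
}

// An object shaped like a Firefox error, used here instead of a real one.
var firefoxStyle = {
  name: 'TypeError',
  message: 'x is undefined',
  fileName: 'http://cdn.com/step1.js',
  lineNumber: 12,
  columnNumber: 4
};
var normalized = normalizeError(firefoxStyle);
```

Attributes that no alias provides are simply left out of the result, which lets the server distinguish "unknown" from a real value.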
The format of the stack trace (stack) is a similar case. This attribute stores the call stack at the time of the exception as plain text, and because the text format differs from browser to browser, we also have to maintain regular expressions by hand to extract the identifier, script, line number, and column number of each frame from that text.
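As an illustration, the sketch below handles two frame formats. The formats themselves are assumptions based on what Chrome and Firefox commonly print, and they may differ between versions; parseStack and the two regular expressions are hypothetical names.

```javascript
// Assumed frame formats (they may change between browser versions):
//   Chrome:  "    at identifier (script:line:column)" or "    at script:line:column"
//   Firefox: "identifier@script:line:column"
var CHROME_FRAME = /^\s*at (?:(.+?) \()?(.+?):(\d+):(\d+)\)?$/;
var FIREFOX_FRAME = /^(.*)@(.+?):(\d+):(\d+)$/;

function parseStack(stack) {
  var frames = [];
  stack.split('\n').forEach(function (text) {
    var match = CHROME_FRAME.exec(text) || FIREFOX_FRAME.exec(text);
    if (match) {
      frames.push({
        identifier: match[1] || '(anonymous)',
        script: match[2],
        line: Number(match[3]),
        column: Number(match[4])
      });
    }
    // Lines that match neither format (such as "Error: message") are skipped.
  });
  return frames;
}

var chromeFrames = parseStack(
  'Error: boom\n' +
  '    at step2 (http://cdn.com/step1.js:10:15)\n' +
  '    at http://cdn.com/step3.js:3:1'
);
var firefoxFrames = parseStack('step2@http://cdn.com/step1.js:10:15');
```

Both inputs produce frames with the same four fields, so the server never has to know which browser the stack came from.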
Security restrictions
If you have ever seen an error whose message is just 'Script error.', you know what this section is about: it is the browser's restriction on script files from a different origin. The reason for the restriction is as follows. Suppose an online bank returns different HTML to a logged-in user than to an anonymous one. A third-party site could put the bank's URI in a script.src attribute. HTML cannot be parsed as JavaScript, so the browser throws an exception, and by examining where the exception occurred the third-party site could work out whether the user is logged in. For this reason, browsers filter exceptions thrown by script files from a different origin, leaving only the message 'Script error.' and dropping every other attribute.
For a site of any size, it is normal to serve script files from a CDN on a different origin. Even when building a small site of your own, you can reference common frameworks such as jQuery and Backbone directly from a public CDN to speed up downloads. So this security restriction does cause trouble: the exception information we collect from Chrome and Firefox is reduced to useless 'Script error.' messages.
CORS
To get around this restriction, we only need to make sure the script file has the same origin as the page itself. But serving script files from a server without CDN acceleration slows down the user's downloads. One solution is to keep the script files on the CDN, download their content through CORS with XMLHttpRequest, and then inject it into the page in a newly created <script> tag. Code embedded in the page is, of course, same-origin.
This is simple to say, but there are many details to implement. Here's a simple example:
<script src="http://cdn.com/step1.js"></script>
<script>
(function step2() {})();
</script>
<script src="http://cdn.com/step3.js"></script>
We all know that if step1, step2, and step3 depend on each other, they must execute strictly in this order, otherwise things may go wrong. The browser can request the step1 and step3 files in parallel, but it guarantees the order at execution time. If we fetch the contents of step1 and step3 ourselves through XMLHttpRequest, we have to guarantee the order ourselves. And do not forget step2: once step1 is downloaded in a non-blocking way, step2 could run before step1 has finished, so we must intervene manually and make step2 wait for step1 to complete.
If we already have a set of tools that generates the <script> tags for the site's pages, we need to adjust those tools to change the <script> tags like this:
<script>
  scheduleRemoteScript('http://cdn.com/step1.js');
</script>
<script>
  scheduleInlineScript(function code() {
    (function step2() {})();
  });
</script>
<script>
  scheduleRemoteScript('http://cdn.com/step3.js');
</script>
We need to implement the scheduleRemoteScript and scheduleInlineScript functions and make sure they are defined before the first <script> tag that references them; the remaining <script> tags are then rewritten in the form above. Notice that the step2 function, which used to execute immediately, is now placed inside a larger code function. The code function is not executed; it is only a container, which lets the original step2 code be kept as it is, without escaping, while preventing it from running immediately.
Next we need a complete mechanism that ensures the file contents downloaded by scheduleRemoteScript from each address and the code passed directly to scheduleInlineScript are executed one after another in the correct order. I will not give the detailed code here; interested readers can implement it themselves.
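For readers who want a starting point, one possible shape for such a mechanism is sketched below. It is not the author's implementation. createScheduler is a hypothetical factory, and fetchSource(url, callback) and execute(code) are assumed, injected dependencies: in a page, fetchSource would wrap a CORS XMLHttpRequest and execute would inject a new script tag or call the container function. The demo replaces both with stand-ins so that downloads can finish out of order.

```javascript
// A sketch only: fetchSource and execute are injected by the caller.
function createScheduler(fetchSource, execute) {
  var queue = []; // entries in document order
  var next = 0;   // index of the next entry to run

  function drain() {
    // Run every consecutive entry that is ready, then stop and wait.
    while (next < queue.length && queue[next].ready) {
      execute(queue[next].code);
      next += 1;
    }
  }

  return {
    scheduleRemoteScript: function (url) {
      var entry = { ready: false, code: null };
      queue.push(entry);
      fetchSource(url, function (source) {
        entry.code = source;
        entry.ready = true;
        drain();
      });
    },
    scheduleInlineScript: function (code) {
      // `code` is the container function; inline code is ready at once
      // but still waits for every entry queued before it.
      queue.push({ ready: true, code: code });
      drain();
    }
  };
}

// Demo with fake downloads that finish out of order.
var pending = {};
var executed = [];
var scheduler = createScheduler(
  function (url, callback) { pending[url] = callback; },
  function (code) {
    executed.push(typeof code === 'function' ? code() : code);
  }
);
scheduler.scheduleRemoteScript('http://cdn.com/step1.js');
scheduler.scheduleInlineScript(function code() { return 'step2'; });
scheduler.scheduleRemoteScript('http://cdn.com/step3.js');
pending['http://cdn.com/step3.js']('step3'); // arrives first, must wait
pending['http://cdn.com/step1.js']('step1'); // now all three run in order
```

Downloads still proceed in parallel; only execution is serialized, which mirrors what the browser does for ordinary script tags.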
Line number reverse lookup
Fetching the content through CORS and injecting it into the page gets around the security restriction, but it introduces a new problem: line number conflicts. Previously, error.script identified a unique script file and error.line identified a unique line within it. Now that everything is embedded code, the <script> tags cannot be told apart by error.script, and the line numbers inside each <script> tag all start from 1, so the exception information no longer locates the source line where the error occurred.
To avoid line number collisions, we can waste some line numbers so that the lines actually occupied by code in each <script> tag do not overlap. For example, assuming the code in each <script> tag is no more than 1000 lines, the code in the first <script> tag can occupy lines 1 to 1000, the code in the second can occupy lines 1001 to 2000 (with 1000 blank lines inserted before it), the code in the third can occupy lines 2001 to 3000 (with 2000 blank lines inserted before it), and so on. We then record this in data-* attributes for easy reverse lookup.
<script
  data-src="http://cdn.com/step1.js"
  data-line-start="1"
>
  // code for step 1
</script>
<script data-line-start="1001">
  // '\n' * 1000
  // code for step 2
</script>
<script
  data-src="http://cdn.com/step3.js"
  data-line-start="2001"
>
  // '\n' * 2000
  // code for step 3
</script>
After this, if the error.line of an error is 2005, the actual error.script is 'http://cdn.com/step3.js' and the actual error.line is 5. We can do this reverse lookup in the reportError function mentioned earlier.
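The reverse lookup is simple arithmetic: find the tag whose data-line-start is the largest value not exceeding the reported line, then subtract. A minimal sketch, assuming the data-* attributes have already been read into an array sorted by lineStart (SCRIPT_RANGES and resolveLocation are hypothetical names):

```javascript
// Assumed data: one record per injected script tag, mirroring its
// data-src and data-line-start attributes, sorted by lineStart.
var SCRIPT_RANGES = [
  { src: 'http://cdn.com/step1.js', lineStart: 1 },
  { src: null, lineStart: 1001 }, // the inline step2 has no data-src
  { src: 'http://cdn.com/step3.js', lineStart: 2001 }
];

function resolveLocation(line, ranges) {
  // Pick the last range that starts at or before the reported line.
  var found = null;
  ranges.forEach(function (range) {
    if (range.lineStart <= line) {
      found = range;
    }
  });
  if (!found) {
    return null;
  }
  return { script: found.src, line: line - found.lineStart + 1 };
}

// With data-line-start="2001", reported line 2005 is line 5 of step3.js.
var location2005 = resolveLocation(2005, SCRIPT_RANGES);
```

Because the lookup depends only on the start of each range, it keeps working when the ranges are not all the same size.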
Of course, we cannot guarantee that every script file stays within 1000 lines, and some script files may be far shorter than 1000 lines, so there is no need to give every <script> tag a fixed interval of 1000 lines. We can assign intervals according to the actual number of lines in each script, as long as the intervals used by the <script> tags do not overlap.
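Assigning intervals from the actual line counts could look like the sketch below. assignLineStarts is a hypothetical helper, and its input, a list of { src, code } records in execution order, is an assumption about how the downloaded sources are held.

```javascript
// Hypothetical helper: gives each script a start line based on the
// number of lines the scripts before it actually use.
function assignLineStarts(scripts) {
  var nextStart = 1;
  return scripts.map(function (script) {
    var lineCount = script.code.split('\n').length;
    var lineStart = nextStart;
    nextStart += lineCount;
    return {
      src: script.src,
      lineStart: lineStart,
      // Pad with (lineStart - 1) blank lines so that the first
      // line of code lands on lineStart.
      paddedCode: new Array(lineStart).join('\n') + script.code
    };
  });
}

var assigned = assignLineStarts([
  { src: 'http://cdn.com/step1.js', code: 'a();\nb();\nc();' },
  { src: 'http://cdn.com/step3.js', code: 'd();' }
]);
```

Here the first script occupies lines 1 to 3, so the second starts at line 4 instead of 1001, and no line numbers are wasted.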
The crossorigin attribute
Browser security restrictions on cross-origin content are of course not limited to the <script> tag. Since XMLHttpRequest can get past the restriction through CORS, why can't resources referenced directly from a tag do the same? Of course they can.
The restriction on <script> tags that reference cross-origin script files also applies to <img> tags that reference cross-origin image files. Once a cross-origin <img> is drawn onto a <canvas>, the <canvas> becomes write-only, which ensures that a site cannot use JavaScript to steal image data it is not authorized to read. Later, the <img> tag solved this problem by introducing the crossorigin attribute. crossorigin="anonymous" is equivalent to anonymous CORS; crossorigin="use-credentials" is equivalent to CORS with credentials.
If the <img> tag can do this, why not the <script> tag? So browser vendors added the same crossorigin attribute to the <script> tag to address the security restriction described above. Right now Chrome and Firefox support this attribute without problems. Safari treats crossorigin="anonymous" as crossorigin="use-credentials", so if the server supports only anonymous CORS, Safari treats the request as a failed authentication. CDN servers are designed to return only static content for performance reasons, so they cannot dynamically generate, per request, the HTTP headers that credentialed CORS requires; Safari therefore cannot use this feature to solve the problem above.
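When script tags are created from JavaScript, the attribute can be set through the element's crossOrigin property. The sketch below assumes the CDN responds with a suitable Access-Control-Allow-Origin header; loadScriptWithCors is a hypothetical helper, and the document is injected so that the example can run outside a browser with a stand-in object.

```javascript
// `doc` is an injected document object; in a page, pass window.document.
function loadScriptWithCors(doc, url) {
  var script = doc.createElement('script');
  script.crossOrigin = 'anonymous'; // reflects the crossorigin attribute
  script.src = url;
  doc.head.appendChild(script);
  return script;
}

// Stand-in document that only records what would be appended.
var appended = [];
var fakeDocument = {
  createElement: function (tagName) {
    return { tagName: tagName };
  },
  head: {
    appendChild: function (node) {
      appended.push(node);
    }
  }
};
var tag = loadScriptWithCors(fakeDocument, 'http://cdn.com/step1.js');
```

The crossOrigin property is set before src so that the request is made in CORS mode from the start.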
Conclusion
JavaScript exception handling looks simple and no different from other languages, but actually catching every exception and analyzing its attributes is not that easy. Some third-party services now offer Google Analytics-style collection of JavaScript exceptions, but to understand the details and principles you still have to work through it yourself.