Using tryCatch()

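We are defining a robust version of a function that reads the HTML code from a given URL. Robust in the sense that we want it to handle situations where something either goes wrong (an error) or does not go quite the way we intended (a warning). The umbrella term for errors and warnings is condition.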
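To make the term concrete, here is a minimal sketch showing that errors and warnings are both condition objects. The helper name `capture_condition` is made up for illustration and is not part of the function defined in this answer.

```r
# Hypothetical helper: evaluate an expression and return the condition
# object it signals, or NULL if nothing is signaled.
capture_condition <- function(expr) {
    tryCatch(
        {
            expr   # forces the (lazily evaluated) argument
            NULL
        },
        # "condition" is the parent class of warnings and errors
        condition = function(cond) cond
    )
}

w <- capture_condition(warning("something looks odd"))
e <- capture_condition(stop("something went wrong"))

class(w)   # includes "warning" and "condition"
class(e)   # includes "error" and "condition"

inherits(w, "condition")
inherits(e, "condition")
conditionMessage(e)
```

Because both classes inherit from "condition", a single `condition =` handler catches either one, while `warning =` and `error =` handlers let you treat them differently.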

Function definition using tryCatch

readUrl <- function(url) {
    out <- tryCatch(

        ########################################################
        # Try part: define the expression(s) you want to "try" #
        ########################################################

        {
            # Just to highlight: 
            # If you want to use more than one R expression in the "try part" 
            # then you'll have to use curly brackets. 
            # Otherwise, just write the single expression you want to try.

            message("This is the 'try' part")
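            # readLines() signals a warning before the error when a connection
            # cannot be opened; suppressWarnings() lets that error reach the
            # error handler instead of stopping at the warning handler.
            suppressWarnings(readLines(con = url, warn = FALSE))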
        },

        ########################################################################
        # Condition handler part: define how you want conditions to be handled #
        ########################################################################

        # Handler when a warning occurs:
        warning = function(cond) {
            message(paste("Reading the URL caused a warning:", url))
            message("Here's the original warning message:")
            message(conditionMessage(cond))

            # Choose a return value when such a type of condition occurs
            return(NULL)
        },

        # Handler when an error occurs:
        error = function(cond) {
            message(paste("This seems to be an invalid URL:", url))
            message("Here's the original error message:")
            message(conditionMessage(cond))

            # Choose a return value when such a type of condition occurs
            return(NA)
        },

        ###############################################
        # Final part: define what should happen AFTER #
        # everything has been tried and/or handled    #
        ###############################################

        finally = {
            message(paste("Processed URL:", url))
            message("Some message at the end\n")
        }
    )    
    return(out)
}
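The same structure can be tried without a network connection. The following is a small sketch, not part of the original function: `safe_log` is a hypothetical helper that uses `log()` to trigger each branch and shows that the value of whichever part runs (the expression or a handler) becomes the value of `tryCatch()`, while `finally` runs in every case.

```r
# Hypothetical helper, not part of readUrl(): take the log of x and
# report which value tryCatch() returned.
safe_log <- function(x) {
    finished <- FALSE
    out <- tryCatch(
        log(x),                          # value returned on success
        warning = function(cond) NULL,   # value returned on a warning
        error = function(cond) NA,       # value returned on an error
        finally = {
            # Runs in every case; its own value is discarded.
            finished <- TRUE
        }
    )
    list(value = out, finished = finished)
}

safe_log(1)$value      # 0: no condition, result of log(1)
safe_log(-1)$value     # NULL: log(-1) signals a warning
safe_log("a")$value    # NA: log("a") signals an error
safe_log("a")$finished # TRUE: the finally part ran
```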

Testing it

Let's define a vector of URLs in which one element is not a valid URL.

urls <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz",
    "I'm no URL"
)

And pass this as input to the function we defined above.

y <- lapply(urls, readUrl)
# This is the 'try' part
# Processed URL: http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html
# Some message at the end
#
# This is the 'try' part
# Processed URL: http://en.wikipedia.org/wiki/Xz
# Some message at the end
#
# This is the 'try' part
# This seems to be an invalid URL: I'm no URL
# Here's the original error message:
# cannot open the connection
# Processed URL: I'm no URL
# Some message at the end
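When reading handler output like this, it helps to know that `tryCatch()` runs the handler for the first condition that is signaled and then exits. If code signals a warning before an error, a `warning` handler wins and the error never happens. The sketch below illustrates this with a made-up function, `warn_then_fail`, which is an assumption for demonstration only.

```r
# Hypothetical function that signals a warning and then an error,
# similar in shape to a failed attempt to open a connection.
warn_then_fail <- function() {
    warning("first a warning")
    stop("then an error")
}

handlers_demo <- function(expr) {
    tryCatch(
        expr,
        warning = function(cond) paste("warning handler:", conditionMessage(cond)),
        error = function(cond) paste("error handler:", conditionMessage(cond))
    )
}

# The warning is signaled first, so the warning handler runs.
res_plain <- handlers_demo(warn_then_fail())
res_plain   # "warning handler: first a warning"

# With warnings suppressed, evaluation continues to the error.
res_quiet <- handlers_demo(suppressWarnings(warn_then_fail()))
res_quiet   # "error handler: then an error"
```

If both the warning and the final result are needed, `withCallingHandlers()` is the usual tool, since its handlers do not exit the computation.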

Investigating the output

length(y)
# [1] 3

head(y[[1]])
# [1] "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"      
# [2] "<html><head><title>R: Functions to Manipulate Connections</title>"      
# [3] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
# [4] "<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">"             
# [5] "</head><body>"                                                          
# [6] ""    

y[[3]]
# [1] NA
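Since the handlers return `NULL` or `NA` instead of page content, the successful downloads can be separated from the failures afterwards. Here is a minimal sketch that uses a stand-in list, `y_demo`, rather than real downloads; both `y_demo` and the helper `is_success` are hypothetical names introduced for illustration.

```r
# Stand-in for the list returned by lapply(urls, readUrl):
# a character vector on success, NULL after a warning, NA after an error.
y_demo <- list(
    c("<html>", "</html>"),
    NULL,
    NA
)

# Hypothetical helper: readLines() returns a character vector, while
# the handlers return NULL or a logical NA.
is_success <- function(x) is.character(x)

ok <- vapply(y_demo, is_success, logical(1))
ok            # TRUE FALSE FALSE

pages <- y_demo[ok]
length(pages) # 1
```

Returning distinct values from the warning and error handlers, as the function above does, is what makes it possible to tell the two failure modes apart later.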