Internet Explorer Object

Set createInternetExplorerObject = CreateObject("InternetExplorer.Application")

Tools > References > Microsoft Internet Controls
Associated DLL: ieframe.dll
Source: Internet Explorer browser

MSDN: InternetExplorer object

Controls an instance of Windows Internet Explorer through automation.
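
The CreateObject line at the top of this topic is the late-bound way to get the same object, and it works without setting the reference. The sketch below assumes nothing beyond that call; the Sub name and the hand-declared constant are only illustrative. With late binding the enum constants from the type library are not available, so the value 4 for "complete" has to be declared manually.

```vba
Sub IELateBound()
    'Late binding: no reference to Microsoft Internet Controls is needed.
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")

    'READYSTATE_COMPLETE comes from the type library, so without the
    'reference we declare it ourselves (its value is 4).
    Const READYSTATE_COMPLETE As Long = 4

    IE.Visible = True
    IE.Navigate2 "http://www.example.com"

    'Prints True or False depending on how far loading has got.
    Debug.Print IE.ReadyState = READYSTATE_COMPLETE

    IE.Quit
    Set IE = Nothing
End Sub
```

Early binding (Dim IE As InternetExplorer) gives IntelliSense and the named constants. Late binding trades those for not depending on the reference being set on every machine.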

Internet Explorer Object basic members

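The code below should introduce how the IE object works and how to manipulate it through VBA. I recommend stepping through it line by line; otherwise it may raise an error while several navigations are in flight.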

Sub IEGetToKnow()
    Dim IE As InternetExplorer 'Reference to Microsoft Internet Controls
    Set IE = New InternetExplorer
    
    With IE
        .Visible = True 'Sets or gets a value that indicates whether the object is visible or hidden.
        
        'Navigation
        .Navigate2 "http://www.example.com" 'Navigates the browser to a location that might not be expressed as a URL, such as a PIDL for an entity in the Windows Shell namespace.
        Debug.Print .Busy 'Gets a value that indicates whether the object is engaged in a navigation or downloading operation.
        Debug.Print .ReadyState 'Gets the ready state of the object.
        .Navigate2 "http://www.example.com/2"
        .GoBack 'Navigates backward one item in the history list
        .GoForward 'Navigates forward one item in the history list.
        .GoHome 'Navigates to the current home or start page.
        .Stop 'Cancels a pending navigation or download, and stops dynamic page elements, such as background sounds and animations.
        .Refresh 'Reloads the file that is currently displayed in the object.
        
        Debug.Print .Silent 'Sets or gets a value that indicates whether the object can display dialog boxes.
        Debug.Print .Type 'Gets the user type name of the contained document object.
        
        Debug.Print .Top 'Sets or gets the coordinate of the top edge of the object.
        Debug.Print .Left 'Sets or gets the coordinate of the left edge of the object.
        Debug.Print .Height 'Sets or gets the height of the object.
        Debug.Print .Width 'Sets or gets the width of the object.
    End With
    
    IE.Quit 'close the application window
End Sub
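
The listing above only reads Silent, Top, Left, Height and Width, although they can also be set. A minimal sketch of setting them follows; the Sub name and the pixel values are arbitrary choices for illustration.

```vba
Sub IEPositionWindow()
    Dim IE As InternetExplorer 'Reference to Microsoft Internet Controls
    Set IE = New InternetExplorer

    With IE
        .Silent = True  'Do not let the browser display dialog boxes.
        .Left = 0       'Arbitrary example position and size.
        .Top = 0
        .Width = 1024
        .Height = 768
        .Visible = True

        'Read the values back; Windows may adjust them.
        Debug.Print .Left, .Top, .Width, .Height

        .Quit 'close the application window
    End With
End Sub
```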

Web Scraping

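The most common thing to do with IE is to scrape some information from a website, or to fill in a website form and submit it. We will look at how to do that.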

Let us consider the source code of example.com:

<!doctype html>
<html>
    <head>
        <title>Example Domain</title>
        <meta charset="utf-8" />
        <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1" />
        <style ... </style> 
    </head>

    <body>
        <div>
            <h1>Example Domain</h1>
            <p>This domain is established to be used for illustrative examples in documents. You may use this
            domain in examples without prior coordination or asking for permission.</p>
            <p><a href="http://www.iana.org/domains/example">More information...</a></p>
        </div>
    </body>
</html>

We can use code like the following to get and set information:

Sub IEWebScrape1()
    Dim IE As InternetExplorer 'Reference to Microsoft Internet Controls
    Set IE = New InternetExplorer
    
    With IE
        .Visible = True
        .Navigate2 "http://www.example.com"
        
        'we add a loop to be sure the website is loaded and ready.
        'Does not work consistently. Cannot be relied upon.
        Do While .Busy = True Or .ReadyState <> READYSTATE_COMPLETE 'Equivalent = .ReadyState <> 4
            ' DoEvents - worth considering. Know implications before you use it.
            Application.Wait (Now + TimeValue("00:00:01")) 'Wait 1 second, then check again.
        Loop
        
        'Print info in immediate window
        With .Document 'the source code HTML "below" the displayed page.
            Stop 'VBE Stop. Continue line by line to see what happens.
            Debug.Print .GetElementsByTagName("title")(0).innerHtml 'prints "Example Domain"
            Debug.Print .GetElementsByTagName("h1")(0).innerHtml 'prints "Example Domain"
            Debug.Print .GetElementsByTagName("p")(0).innerHtml 'prints "This domain is established..."
            Debug.Print .GetElementsByTagName("p")(1).innerHtml 'prints "<a href="http://www.iana.org/domains/example">More information...</a>"
            Debug.Print .GetElementsByTagName("p")(1).innerText 'prints "More information..."
            Debug.Print .GetElementsByTagName("a")(0).innerText 'prints "More information..."
            
            'We can change the locally displayed website. Don't worry about breaking the site.
            .GetElementsByTagName("title")(0).innerHtml = "Psst, scraping..."
            .GetElementsByTagName("h1")(0).innerHtml = "Let me try something fishy." 'You have just changed the local HTML of the site.
            .GetElementsByTagName("p")(0).innerHtml = "Lorem ipsum........... The End"
            .GetElementsByTagName("a")(0).innerText = "iana.org"
        End With '.document
        
        .Quit 'close the application window
    End With 'ie
    
End Sub

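What is going on here? The key player is .Document, the HTML source code of the page. We can apply queries to it to get the collections or objects we want.
For example, IE.Document.GetElementsByTagName("title")(0).innerHtml. GetElementsByTagName returns a collection of the HTML elements that have the "title" tag. There is only one such tag in the source code. The collection is 0-based, so to get the first element we add (0). In our case we want only the innerHtml (a String), not the Element object itself, so we specify the property we want.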
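
Because GetElementsByTagName returns a collection, it can also be walked as a whole instead of picking items by index. The sketch below does that for the two "p" elements of example.com; the Sub and variable names are made up for illustration, and the wait loop has the same reliability caveats noted in the code above.

```vba
Sub IEListParagraphs()
    Dim IE As InternetExplorer 'Reference to Microsoft Internet Controls
    Dim paragraphs As Object   'the collection returned by the query
    Dim el As Object           'one element of that collection
    Dim i As Long

    Set IE = New InternetExplorer
    IE.Navigate2 "http://www.example.com"
    Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Set paragraphs = IE.Document.GetElementsByTagName("p")
    Debug.Print "Number of <p> elements: " & paragraphs.Length

    'Option 1: For Each over the collection.
    For Each el In paragraphs
        Debug.Print el.innerText
    Next el

    'Option 2: index-based loop, from 0 to Length - 1.
    For i = 0 To paragraphs.Length - 1
        Debug.Print i, paragraphs(i).innerText
    Next i

    IE.Quit
End Sub
```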

Click

To follow a link on a site, we can use several methods:

Sub IEGoToPlaces()
    Dim IE As InternetExplorer 'Reference to Microsoft Internet Controls
    Set IE = New InternetExplorer
    
    With IE
        .Visible = True
        .Navigate2 "http://www.example.com"
        Stop 'VBE Stop. Continue line by line to see what happens.
        
        'Click
        .Document.GetElementsByTagName("a")(0).Click
        Stop 'VBE Stop.
        
        'Return Back
        .GoBack
        Stop 'VBE Stop.
        
        'Navigate using the href attribute in the <a> tag, or "link"
        .Navigate2 .Document.GetElementsByTagName("a")(0).href
        Stop 'VBE Stop.
        
        .Quit 'close the application window
    End With
End Sub
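
The same .Click call is typically how a form gets submitted, which the Web Scraping section mentions but does not show. The sketch below is only an illustration: example.com has no form, so the URL and the element ids "username" and "submit" are hypothetical placeholders to replace with those of a real page. The code checks for missing elements, so it runs without error even against a page that lacks them.

```vba
Sub IEFillFormSketch()
    Dim IE As InternetExplorer 'Reference to Microsoft Internet Controls
    Dim field As Object, submitButton As Object
    Set IE = New InternetExplorer

    With IE
        .Visible = True
        .Navigate2 "http://www.example.com/login" 'HYPOTHETICAL URL
        Do While .Busy Or .ReadyState <> READYSTATE_COMPLETE
            DoEvents
        Loop

        'HYPOTHETICAL ids. getElementById gives Nothing if the id is absent.
        Set field = .Document.getElementById("username")
        Set submitButton = .Document.getElementById("submit")

        If field Is Nothing Or submitButton Is Nothing Then
            Debug.Print "Form elements not found on this page."
        Else
            field.Value = "my user name" 'fill in the text box
            submitButton.Click           'submit by clicking the button
            Do While .Busy Or .ReadyState <> READYSTATE_COMPLETE
                DoEvents
            Loop
        End If

        .Quit 'close the application window
    End With
End Sub
```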

Microsoft HTML Object Library, or IE's best friend

To get the most out of the HTML that is loaded into IE, you can (or should) use another library, the Microsoft HTML Object Library. More about this in another example.
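
As a rough idea of what that library adds, the sketch below declares the document and its elements with MSHTML types, which gives IntelliSense and compile-time checking. It assumes both references are set (Microsoft Internet Controls and Microsoft HTML Object Library); the Sub and variable names are illustrative.

```vba
Sub IEWithMSHTML()
    'References: Microsoft Internet Controls, Microsoft HTML Object Library
    Dim IE As InternetExplorer
    Dim doc As MSHTML.HTMLDocument
    Dim heading As MSHTML.IHTMLElement
    Dim link As MSHTML.HTMLAnchorElement

    Set IE = New InternetExplorer
    IE.Navigate2 "http://www.example.com"
    Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Set doc = IE.Document 'typed reference to the loaded page
    Debug.Print doc.Title

    Set heading = doc.getElementsByTagName("h1")(0)
    Debug.Print heading.innerText

    Set link = doc.getElementsByTagName("a")(0)
    Debug.Print link.href

    IE.Quit
End Sub
```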

IE Main issues

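The main issue with IE is verifying that the page has finished loading and is ready to be interacted with. The Do While... Loop helps, but it is not reliable.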
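
One way to keep such a loop from hanging forever is to give it a timeout. The helper below is a sketch, not a fix for the underlying unreliability: the names WaitForIE and timeoutSeconds are invented here, and a page can still report "complete" before its scripts have finished building the content.

```vba
'Returns True if IE reported "complete" before the timeout, False otherwise.
Function WaitForIE(ByVal IE As Object, _
                   Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Const READYSTATE_COMPLETE As Long = 4
    Dim deadline As Date
    deadline = Now + timeoutSeconds / 86400# 'there are 86400 seconds in a day

    Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents 'keep the host application responsive while waiting
        If Now > deadline Then Exit Function 'give up; result stays False
    Loop
    WaitForIE = True
End Function

Sub IEWaitDemo()
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Navigate2 "http://www.example.com"

    If WaitForIE(IE, 10) Then
        Debug.Print IE.Document.Title
    Else
        Debug.Print "Timed out waiting for the page."
    End If
    IE.Quit
End Sub
```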

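Also, using IE just to scrape HTML content is overkill. Why? Because a browser is meant for browsing, i.e. displaying the web page with all its CSS, JavaScript, pictures, pop-ups and so on. If you only need the raw data, consider a different approach, for example XMLHttpRequest. More about this in another example.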
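
To give a taste of that approach, the sketch below fetches the raw HTML of example.com with the MSXML2.XMLHTTP.6.0 object and no browser window at all. The function name FetchHtml is invented for illustration, and the request is synchronous to keep the example short.

```vba
'Returns the response body as text, or an empty string if the request failed.
Function FetchHtml(ByVal url As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")

    http.Open "GET", url, False 'False = synchronous: send waits for the reply
    http.send

    If http.Status = 200 Then
        FetchHtml = http.responseText
    Else
        Debug.Print "Request failed: " & http.Status & " " & http.statusText
    End If
End Function

Sub FetchHtmlDemo()
    Dim html As String
    html = FetchHtml("http://www.example.com")
    Debug.Print Left$(html, 200) 'first 200 characters of the raw source
End Sub
```

Nothing is rendered and no scripts run, so this only helps when the data is present in the HTML the server sends.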