2016年8月11日星期四

How to check if there is a page break issue


In the pharmaceutical industry, one of the most annoying tasks is to check if there is a page break issue. Programmers have to open every single RTF file and look into page one by one. It’s time-consuming especially when there are a lot of RTF files.

Before showing how to do this task in VBA, I’d like to give you a brief introduction on how the issue occurs. As we all know, BREAK STATEMENT together with PAGE BREAK VARIABLE (usually a group or order variable) in PROC REPORT can be used to control page breaks. The report will start a new page (let’s call it “Virtual Page”) when values of page break variable change and we expect that the size of Virtual Page is less than or equal to that of Physical Page (what we see in the Microsoft Word). However, the content in a Virtual Page will span on more than one Physical Page if the size of Virtual Page is larger and thus lead to the occurrence of page break issue.

We can imply from above that the total number of Virtual Pages will not equal to that of Physical Pages if there is a page break issue. Fortunately, we can calculate both Virtual Page and Physical Page via Excel VBA. It is easy to return total number of Physical Pages. But counting total number of Virtual Pages requires little trick. In fact, there is a one-to-one relationship between Virtual Pages and Sections.  And we can count total number of sections in a word document. Therefore, by counting total number of sections, we can get total number of Virtual Pages.

So far, you must know how to examine one specific RTF file. However, programmers need to check all RTF files before each delivery.  What should we do when it comes to multiple RTF files? Below are the features incorporated into my macro to solve this issue:



1

Copy all file names within one folder into Excel

2

Check RTF file one by one and put total number of Physical Pages and total number of Virtual Pages in Excel

3

Compare total number of Physical Pages against total number of Virtual Pages to determine if there is a page break issue

Following is the full program and explanations:



Sub PageBreak_Click()

'Define object

 Dim objFSO As Object

 Dim objFolder As Object

 Dim objFile As Object

 Dim i As Integer

 Dim j As Integer

'Find the last row with data

lastrow = ThisWorkbook.Worksheets(1).UsedRange.Rows.Count

'Clear content

For i = 2 To lastrow

  For j = 2 To 5

    Cells(i, j) = ""

  Next j

Next i

'Create an instance of the FileSystemObject

Set objFSO = CreateObject("Scripting.FileSystemObject")

'Get the folder object

inpath = Trim(ThisWorkbook.Worksheets(1).Cells(2, 1))

Set objFolder = objFSO.GetFolder(inpath)

Dim objWord As Object

Set objWord = CreateObject("Word.Application")

objWord.Visible = False

i = 2

'loops through each file in the directory and prints their names and path

For Each objFile In objFolder.Files

    WordNam = objFile.Name

    temp = Split(objFile.Name, ".")

    If temp(UBound(temp)) = "rtf" Or temp(UBound(temp)) = "docx" Then

        Set wdDoc = objWord.Documents.Open(inpath & "\" & WordNam)

           Cells(i, 2) = objFile.Name

           'Total number of Physical page

           Cells(i, 3) = wdDoc.ActiveWindow.Panes(1).Pages.Count

           'Total number of Virtual Page

           Cells(i, 4) = wdDoc.sections.Count

        If wdDoc.ActiveWindow.Panes(1).Pages.Count <> wdDoc.sections.Count Then

           Cells(i, 5) = "Y"

        Else

           Cells(i, 5) = "N"

        End If

        i = i + 1

    End If

    wdDoc.Close

Next objFile

objWord.Quit

ThisWorkbook.Save

End Sub

Here shows the final results. You can filter data to see which RTF file has page break issue.

没有评论:

发表评论