PDF Viewer/Scanner/Importer Screen with Database Storage

This week’s Uncle Pete’s Corner looks at my PDF Viewer/Scanner/Importer screen. We will be storing the images in a Hyperfile C/S database. This will be 100% native WinDev code: no ActiveX or .NET required!

One of my very early projects involved document storage. That was a long time ago! The documents weren’t stored on stone tablets, but let’s just say the technology has changed quite a bit since I got started. Since those days, I have added document storage capabilities to nearly every application I have created. It’s been one of those bells-and-whistles features that has often moved my applications to the top of the list when doing demos. It is also one of the most used features, even by clients that originally didn’t think they wanted document storage. As you will see in this article, with some forethought in the design you can have a document storage solution that you can incorporate into your projects very quickly.

I actually have a few different document storage solutions and approaches, depending on the project’s needs.

One of the first decisions to make is whether you will store the images in the database or as physical files. Each has advantages and disadvantages.

When storing images in a database some of the disadvantages are:

  1. The database can get very large
  2. In a recovery situation a database is not available until it is fully restored, and with a large database that can take several hours.
  3. Often to perform the required functions on the image a physical file is required, so you often have to move the image from the database to a temporary file, which adds some overhead and time to those functions
  4. The images cannot be manipulated outside of your software, for example to archive or delete old images.

And some of the advantages are:

  1. No special network drive access is required. Security is handled via your normal database and application security
  2. When dealing with remote access users #1 becomes even more important.
  3. Backup of the images is handled by your normal database backup processes
  4. The images cannot be manipulated outside of your software. Yeah, I know it’s on both the disadvantages and advantages lists, just think about it 😉

When storing images as physical files some of the disadvantages are:

  1. A large number of files in a single directory can cause performance issues.
  2. Backups of individual files can be an issue
  3. Any users of the application must have network read/write access to the location of the files
  4. Remote users further complicate #3
  5. Images can be manipulated outside of your application. Combined with #3 this can be very dangerous.

And some of the advantages are:

  1. In a recovery situation the application can be up and running while the images are not yet available. As the images are restored they become available one by one.
  2. Security is much easier to control
  3. Images can be manipulated outside of your application. Again, yeah, it’s on both lists, but this is very handy if you want to archive a group of files or remove old files, etc.

For this article we will be storing images in the database. To combat some of the disadvantages of this approach, I set up a separate database (“Document Store”) just for the images. This allows it to be backed up and restored separately from the normal database, as well as to reside on a separate server if needed.

[Screenshot: the separate “Document Store” database]

The next conversation I generally have with my clients is how robust and feature-rich a solution is required. A native WinDev solution is possible, but it lacks some of the higher-end features and performance of my ActiveX solution, such as rotating images, editing individual pages of a multipage document, better performance on large PDF files, handling of more file formats, etc.

Today we are looking at the Native WinDev solution.

So let’s get started. The “Document Store” database only has one table in it. Let’s look at that structure.

[Screenshot: structure of the Document Store table]

First it has an auto-incrementing primary key (DocumentID). What self-respecting database table doesn’t?

Next is Document, which is the binary field that will actually hold the image.

LinkTable and LinkID allow a document to be attached to any record in the database. Remember, I wanted this solution to be something that I could easily drop into any project. Later we will see these used along with the magic of indirection to provide a portable solution.

DateAdded and AddedByPersonID are something that I add to many of my tables for audit (err, finger-pointing) purposes.

As you can see, there is nothing elaborate about the table design at all. Let’s move on to the Viewing/Scanning screen.
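To make the role of LinkTable and LinkID concrete, here is a minimal sketch in Python with SQLite rather than WLanguage and Hyperfile. The field names come from the table above; the table name `DocumentStore`, the column types, and the helper names `attach_document` and `documents_for` are assumptions made only for illustration.

```python
import sqlite3
from datetime import datetime

# Assumed column types; the article only names the fields.
SCHEMA = """
CREATE TABLE DocumentStore (
    DocumentID      INTEGER PRIMARY KEY AUTOINCREMENT,
    Document        BLOB,
    LinkTable       TEXT,
    LinkID          INTEGER,
    DateAdded       TEXT,
    AddedByPersonID INTEGER
)
"""

def attach_document(db, pdf_bytes, link_table, link_id, person_id):
    """Store a document and link it to one record of any table."""
    cur = db.execute(
        "INSERT INTO DocumentStore "
        "(Document, LinkTable, LinkID, DateAdded, AddedByPersonID) "
        "VALUES (?, ?, ?, ?, ?)",
        (pdf_bytes, link_table, link_id, datetime.now().isoformat(), person_id),
    )
    db.commit()
    return cur.lastrowid

def documents_for(db, link_table, link_id):
    """Return the DocumentIDs attached to one specific record."""
    rows = db.execute(
        "SELECT DocumentID FROM DocumentStore "
        "WHERE LinkTable = ? AND LinkID = ? ORDER BY DocumentID",
        (link_table, link_id),
    )
    return [row[0] for row in rows]

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
```

Because the link is just a table name plus a key value, the same store can serve invoices, customers, or anything else without a schema change.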

You don’t get as old as I am in this field without learning a few rules to live by.

Rule #14: Never write anything from scratch if you can borrow something and modify it to your needs.

One of the first places I look for a likely suspect to borrow from is the examples that come with WinDev, and in this case there is one that fits the bill nicely, called “WD PDF Viewer”. So I started with that screen and then changed things around to suit my purposes. A word of warning about the examples, however: many of them are teaching examples and/or were written in older versions of WinDev, so the code isn’t always production-level code. I always review and modify all the code. I use the examples as a springboard for my solution, never as a drop-in solution.

Well, I guess I have stalled enough. Let’s move on and look at the actual solution. Here is a screenshot of the Viewer/Scanner in a production application.

[Screenshot: the Viewer/Scanner screen in a production application]

So let’s start picking this screen apart and looking at the code that makes it work, starting with the Global Declaration code.

[Screenshot: Global Declaration code]

It accepts 3 parameters:

  1. DocumentID – the ID of the document record we are viewing / scanning. If 0 it means we will be creating a document record.
  2. LinkTable – A string value with the name of the table this record will be attached to.
  3. LinkID – An integer value with the primary key value of the specific record this record will be attached to.

The next 4 lines deal with creating a temporary file. Remember I said that when we store the image in a database, we will often need to create a temporary physical file to manipulate the image. Here it is. fTempPath() is a WX function that returns the user’s current temporary path, and fTempFile generates a random name for a temporary file. fTempFile always generates the file with a .tmp extension, so I simply change that to .PDF. Some utilities can’t recognize a file as a PDF unless the extension is .PDF. And finally I issue a delete statement just in case the file already exists. Remember this is the user’s temporary directory, so I am not that worried about deleting a file that may have existed; it’s just a bit of the belt-and-suspenders style of programming that is my trademark.

The IF statement determines if we are viewing a document that was already created or if we are creating a new document. For an existing document we fetch the record from the database, then create a file using the temporary filename we created above, and move the contents of the binary field into the file. Then we close the file and set the image control to the name of the temporary file. Otherwise we set the image control to an empty string so that no image is displayed.

And finally line 8 calls a local procedure that does the work of the initial display of the image, handling page number, thumbnails, etc. So let’s take a look at that procedure.
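The original code here is WLanguage and lives in the screenshot. As a rough illustration of the same steps (get a unique temporary name, swap the extension to .pdf, delete any leftover file), here is a minimal Python sketch. The function name `make_temp_pdf_path` is hypothetical, and Python’s `tempfile` module stands in for fTempPath and fTempFile.

```python
import os
import tempfile
from pathlib import Path

def make_temp_pdf_path() -> Path:
    """Return a unique, not-yet-existing .pdf path in the user's temp directory."""
    # mkstemp creates an empty file with a unique name and returns an open handle.
    handle, tmp_name = tempfile.mkstemp(suffix=".tmp")
    os.close(handle)
    os.remove(tmp_name)  # we only wanted the unique name, not the file

    # Swap the .tmp extension for .pdf so PDF tools recognize the file.
    pdf_path = Path(tmp_name).with_suffix(".pdf")

    # Belt and suspenders: remove any leftover file with the same name.
    pdf_path.unlink(missing_ok=True)
    return pdf_path
```

Note that `missing_ok` requires Python 3.8 or later.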

[Screenshot: DisplayImage() procedure code]

We set the page number to 1, retrieve the total number of pages from the image control and set the static on the screen to that value.

Next we handle the looper for thumbnails. We start by clearing the existing looper if there is something in it, then we loop through the number of pages and assign the attributes. Let’s look at the looper definition.

[Screenshot: thumbnail looper definition]

The first attribute is the image control itself, and we make it the same as the Image control for the main display. Then we set the PageNumber property of that attribute so that each row displays its own page, and finally we set a static to the page number so that it displays under the thumbnail. If we look at the Thumbnail Image control in the looper, we will see that it is declared as Homothetic centered, which means the image is proportionately shrunk to fit the control, therefore producing a thumbnail.

[Screenshot: thumbnail Image control set to Homothetic centered]

And that is all there is to displaying a list of thumbnails for the pages of the document.

Back to our Display Image Code. Line 14 checks to see if we have a document or not. If we do it sets a group of controls to active, otherwise it sets them to grayed. Groups are a handy way in WinDev to group several controls together so that you can handle their properties as a group. From the modification tab, select groups and you can see all the controls that are in the group. Without the group we would have to code a separate statement for each one of those controls to enable or disable them.

[Screenshot: the controls that belong to the group]

To add a control to a group, right click on it and use the Groups option from the popup menu.

[Screenshot: Groups option in the popup menu]

Finally, on Line 21 of our DisplayImage() procedure, it calls another procedure, AssignZoomComboBox(). So let’s take a look at that procedure.

[Screenshot: AssignZoomComboBox() procedure code]

First notice it has a parameter, nZoom, which specifies the zoom percentage. It also has a default value of -1, meaning that if the parameter is not passed it will have a value of -1. The call inside DisplayImage() did not pass a parameter, so in this instance nZoom will indeed be -1.

If nZoom is -1, then nZoom is set to the current Zoom property of the image control. This means that whatever zoom percentage was used to get the document to fit into the image control is what nZoom will be set to.

Next we use ListSeek to see if that value is in the list. If it is, then it is selected; otherwise the combo is set to that value. This is why, when the screen first opens, a zoom value is displayed in the combo box that isn’t one of the valid values defined. Notice in our original screenshot the zoom value shown is 19; however, if we look at the definition of the combo control, we see that is not one of the possible values.

[Screenshot: zoom combo box definition]

Another key part of the combo box definition that allows this to happen is setting it to be editable.

[Screenshot: combo box editable setting]

Notice when we drop down the combo box, 19 is not shown as an option.

[Screenshot: zoom combo box dropped down]
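The select-or-type behavior of AssignZoomComboBox() can be sketched in a few lines. This is Python, not the WLanguage from the screenshot, and it only mirrors the logic described above. The function name, the return shape, and the list of zoom values are assumptions for illustration; the -1 default and the "fall back to the current zoom" rule come from the article.

```python
def assign_zoom_combo(combo_values, current_zoom, n_zoom=-1):
    """Return (selected_index, displayed_text) for an editable zoom combo.

    selected_index is -1 when the value is not one of the predefined
    entries and is only shown as typed text in the edit portion.
    """
    # No explicit zoom passed: use whatever the image control is using now.
    if n_zoom == -1:
        n_zoom = current_zoom

    text = str(n_zoom)
    if text in combo_values:
        # The value is a predefined entry, so select that row.
        return combo_values.index(text), text

    # Not in the list: an editable combo can still display it.
    return -1, text


# Hypothetical predefined entries; the article does not list the real ones.
ZOOM_CHOICES = ["25", "50", "75", "100", "150", "200"]
```

With these assumed entries, a fitted zoom of 19 is displayed but not selected, while an explicit 50 selects the second row.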

Let’s take a look at the code behind the toolbar buttons next.

We will skip the Scan group for now and look at the formatting buttons first, starting with the “Fit to Height” button.

[Screenshot: Fit to Height button code]

Along with setting the Zoom property to a percentage, WX gives us some very handy constants, in this case zoomAdaptHeight, meaning fit to height. That’s it: no code to figure out the document dimensions and ratio and calculate what size to display. Pass a constant and we are done. Ain’t WX coding fun!!!!

Once we set the Zoom level we call the AssignZoomComboBox() procedure that we already looked at, again without passing a parameter, meaning it is going to set the combo box to the current Zoom property of the image control. A handy feature of the Zoom property is that even though we set it to the constant zoomAdaptHeight, when we read the property it gives us the actual zoom percentage it is using, in this case 19%.

The Fit to Width and Fit to Page buttons use the same code, just with the constants zoomAdaptWidth and zoomAdaptSize.

[Screenshot: Fit to Width button code]

[Screenshot: Fit to Page button code]

Looking at the code behind the combo box, we see that the exact same code is in both the Exit from Control event (when you type a value) and the Selecting a Row event (when you use the drop down).

[Screenshot: zoom combo box code]

It turns the value into an integer, and then sets the Zoom property of the image control.

The first page button, makes sure that we are not already on page 1, then sets the edit control and the PageNumber property to 1, and selects the thumbnail from the looper for that same page.

[Screenshot: First Page button code]

The Previous Page button does the same thing, but only decreases the page number by 1 at a time.

[Screenshot: Previous Page button code]

As I am sure you guessed the Next Page and Last Page are more of the same except incrementing the Page. Notice that the Last page uses the Static stcTotalPages that we set in the DisplayImage() procedure in order to get the number of the last page.

[Screenshot: Next Page button code]

[Screenshot: Last Page button code]

The code behind the entry field for page number is again very similar, but this time as you enter the number, we set the PageNumber property to the value you entered.

[Screenshot: page number entry field code]
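All five navigation controls share one idea: change the page only when the move is valid, then let the display follow. Here is a minimal Python sketch of that logic. It is not the WLanguage from the screenshots, the class and method names are hypothetical, and the range check in `go_to` is an addition for this sketch rather than something the article describes.

```python
from dataclasses import dataclass

@dataclass
class PageNavigator:
    """Tracks the current page of a document with total_pages pages."""
    total_pages: int
    page: int = 1

    def first(self) -> int:
        # Only act if we are not already on page 1.
        if self.page != 1:
            self.page = 1
        return self.page

    def previous(self) -> int:
        # Step back one page, never below page 1.
        if self.page > 1:
            self.page -= 1
        return self.page

    def next(self) -> int:
        # Step forward one page, never past the last page.
        if self.page < self.total_pages:
            self.page += 1
        return self.page

    def last(self) -> int:
        # The last page number comes from the stored total page count.
        if self.page != self.total_pages:
            self.page = self.total_pages
        return self.page

    def go_to(self, requested: int) -> int:
        # Assumption: ignore out-of-range input instead of raising an error.
        if 1 <= requested <= self.total_pages:
            self.page = requested
        return self.page
```

In the real window, each of these would also update the page edit control and select the matching thumbnail row in the looper.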

The next button, which looks like a cog, is the select scanner button. A user could possibly have more than one TWAIN device and we need to make sure we use the correct one. Prepare yourself for the massive amount of code this is going to be.

[Screenshot: select scanner button code]

That’s right, one line of code. WinDev gives us native TWAIN functions, so no .NET library, API calls, DLLs, ActiveX or anything else to hurt our brains, just a simple call to the native function.

There is no code behind the Show Scanner Setup check box; it is just used to trigger showing the scanner setup dialog, as we will see shortly.

Back to the scan group, we will leave the scan button for last. Let’s look at the import document button first.

[Screenshot: import document button code]

This uses the fSelect function to open a file selection dialog and let the user select a PDF file from the drive. If they do select a file, then we copy that file to the temporary file name we declared in the initialization code of the window. We set the Image control to an empty string and then back to the file name. This is a little trick to get it to sense that the contents of the file have changed. And then we call our DisplayImage() procedure to set up the screen again. Note this function completely replaces the document. You could expand this logic to insert pages, etc. The reason I don’t is that when I need those more robust features, I generally also need the more robust features of my ActiveX version of the document viewer/scanner, so I keep the feature set of this version simple and easy to support.

The save a copy button is sort of the opposite of the code we just looked at. It uses fSelect to prompt for the name to save to, then uses fCopyFile to copy the image to that file name.

[Screenshot: save a copy button code]

The Print Button actually has 33 lines of code, which is a lot for WX code!

[Screenshot: Print button code]

The first 3 lines set up some variables; notice we are using a WX function, fTempPath, to find the Windows temporary path.

Line 5 calls the Printer Setup dialog of Windows, and as long as the user doesn’t cancel we fall into our print logic.

Line 7 sets a second Image control to the value of the first Image control. This second image control is hidden. We are just going to use it to resize the image for printing, and we don’t want the image manipulation to be visible on the screen.

Line 10 uses another built in WX function BitMapInfo to retrieve various information about image files.

Lines 11 and 12 use the information retrieved by BitMapInfo to set the Width and Height of our hidden Image control to the full width and height of our document. We do this so the remaining code is using the highest resolution version available.

Line 14 sets up a loop for the total number of pages.

Line 15 changes the image control to the current page of the loop.

Line 16 declares a temporary file name that includes the page number as part of the name (C:\windows\temp\1_pdf.jpg, etc.).

Line 18 uses another WX native function, DsaveImageJpeg, to save the current page of the image control to disk.

Line 19 uses the iPrintImage function to print that image, while proportionately scaling it, to the printer selected by iconfigure().

The IF statement on Line 21 checks to see if there are more pages, and if there are, issues an iSkipPage() statement so we only get one image per page.

Line 25 actually triggers windows to do the printing once the loop has completed.

Lines 28 – 31 just do a second loop to delete the temporary files. If you delete them during the first loop you will get a printout with blank pages. Go ahead, ask me how I know 🙂

Finally, Line 32 clears the hidden image control. This is done to avoid it holding an image in memory and possibly locking the disk file.
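The "delete in a second loop" rule is easier to see with a toy model. The Python sketch below is not the WLanguage from the screenshot. It assumes, purely for illustration, that the print job reads the image files only when the job is closed; `FakeSpooler` and `print_document` are hypothetical names, and no real printing API is involved.

```python
from pathlib import Path

class FakeSpooler:
    """Stand-in print job that reads queued image files only when it ends."""

    def __init__(self):
        self.queued = []
        self.printed = []

    def print_image(self, path):
        # Only remember the file name; nothing is read yet.
        self.queued.append(Path(path))

    def end_printing(self):
        # Files are read here, so a deleted file prints as a blank page.
        for path in self.queued:
            self.printed.append(path.read_bytes() if path.exists() else b"")
        self.queued.clear()


def print_document(pages, spooler, work_dir):
    """Save each page to a temp file, print it, then clean up afterwards."""
    temp_files = []
    for number, image_bytes in enumerate(pages, start=1):
        page_file = Path(work_dir) / f"{number}_pdf.jpg"
        page_file.write_bytes(image_bytes)
        temp_files.append(page_file)
        spooler.print_image(page_file)

    # Finish the job first...
    spooler.end_printing()

    # ...and only then delete the temporary files in a second loop.
    for page_file in temp_files:
        page_file.unlink()
    return len(temp_files)
```

Moving the `unlink()` call into the first loop would leave every entry in `printed` empty in this model, which matches the blank-page symptom described above.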

And that finally brings us to the scan button. If you watched the webinar that goes with this article, you know that we discussed a few different options for this button. Ben Riebens mentioned he has a similar function but uses TwaintoControl. I spent some time testing that method but could not get it to work with the rest of my code. I am sure with some experimenting I could get it to work; however, we also discussed that TwaintoPDF is coming with v19. Since that will be the ultimate solution for me, I decided to just stay with my existing code that is working in production until we move to v19. So let’s take a look at the code.

[Screenshot: Scan button code]

Lines 10 and 11 (I have some commented code at the beginning from my experimentation) set up some local variables.

Line 13 clears any existing temporary files that might have been created by a previous scan.

Line 15 sets up a loop

Line 16 uses another WX native TWAIN function, TwainToBMP, to capture the TWAIN output into BMP files. Notice it builds a dynamic file name that includes the page number (scan1.bmp, etc.). You now see the use of the check box control from the screen to trigger displaying the scanner setup screen. We use TwainBlackWhite because this is intended for document storage, and that will give us both the cleanest and smallest images. Since we are doing black and white, it is a 1-bit image. We are requesting a 300 DPI resolution, again balancing quality and size. And we take the defaults for contrast and brightness.

If the TwainToBMP function fails, Line 17 breaks out of the loop.

Line 19 increments the page number counter with each new cycle of the loop.

Line 20 breaks out of the loop if the TWAIN function tells us it has completed.

We turn on the hourglass cursor with Line 23 so the user knows we are busy working.

Line 24 decrements the page counter by one, because the way our logic is set up we would always have a count one greater than the true number of pages.

The IF statement on Line 25 checks to see if any pages were scanned, and if so falls into our logic to process the scans.

Line 26 clears our image control

Line 27 deletes our temporary file that holds the image. Again just like the import function this is going to completely replace the existing document.

Lines 28 and 29 set some printing parameters so that our printout is letter sized. What printout, you ask? Well, we are going to use another cool feature of WX and print to PDF to convert our scanned images into a multiple-page PDF file.

Lines 30 and 31 set up a temporary file name for the PDF file.

Line 32 deletes that file just in case it already exists.

Line 33 sets up the print function to print to PDF in the file name we just declared.

Lines 34 – 41 are a loop very similar to the loop in our print button that loops through and prints each image. However, this time we are setting the size based on the paper size of the printer, because we want the scanned document to fit a letter-size piece of paper. And as discussed, instead of going to the printer it is going to a PDF file.

Line 42 resets our temporary file to the new PDF file name

Line 43 sets our image control to that file.

Line 44 calls our DisplayImage() procedure to set the screen back up.

And finally at Line 47 we turn off the hourglass cursor since we are all done.

That’s all there is to scanning to PDF. Less than 50 lines of code, and with v19 that will likely be less than 10 lines of code when we can use TwainToPDF!!!

The final code to look at is the window closing code. It simply moves the document to the database field and either updates or adds the record to the database, based on whether a document ID was originally passed in or not. At Line 13 it deletes the temporary file we created for viewing. And at Line 15 it returns the document ID, which is especially useful when adding a new document if your design is to include the document ID in the related record, such as including the document ID in the APInvoiceHeader table.
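The page counting in the scan loop is the part most likely to trip someone up, so here is a minimal Python sketch of just that logic. It is not the WLanguage from the screenshot. `scan_all_pages` and `FakeFeeder` are hypothetical names, and the two callables stand in for the TWAIN calls described above.

```python
def scan_all_pages(scan_page, scan_finished):
    """Scan until a scan fails or the device reports completion.

    scan_page(n) scans page n and returns True on success.
    scan_finished() returns True when there is nothing left to scan.
    Returns the number of pages actually scanned.
    """
    page = 1
    while True:
        if not scan_page(page):
            break           # the scan failed or was cancelled
        page += 1           # move the counter to the next page
        if scan_finished():
            break           # the device says it is done
    # The counter always ends one past the last scanned page.
    return page - 1


class FakeFeeder:
    """Toy document feeder holding a fixed number of sheets."""

    def __init__(self, sheets):
        self.sheets = sheets
        self.scanned = []

    def scan_page(self, page):
        if self.sheets == 0:
            return False
        self.sheets -= 1
        self.scanned.append(f"scan{page}.bmp")
        return True

    def finished(self):
        return self.sheets == 0
```

Whichever way the loop exits, the counter sits one higher than the number of files written, which is why the decrement is needed before testing whether anything was scanned.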

[Screenshot: window closing code]

And there you have it: a PDF document viewer and scanner function. Hopefully this is something you can easily add to your applications in the future. Who knows, it might be the one bell and whistle that lands you that next contract. If so, don’t forget to take care of your Poor Uncle Pete 🙂

Be sure to go over to wxLive.us and watch the Uncle Pete’s Corner webinar that is the companion to this article.

Uncle Pete’s Corner is a weekly webinar on all things WX, every Friday. To watch the recorded version of this webinar or many other WX-related webinars, or to watch future ones live, go to wxLive.us
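The update-or-add decision in the closing code can be sketched as follows. This is Python with SQLite, not the WLanguage and Hyperfile code from the screenshot. The table name `DocumentStore`, the reduced column list, and the function names are assumptions for illustration only.

```python
import sqlite3
from pathlib import Path

def open_store():
    """Create an in-memory stand-in for the document table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE DocumentStore ("
        "DocumentID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "Document BLOB, LinkTable TEXT, LinkID INTEGER)"
    )
    return db

def save_on_close(db, document_id, temp_pdf, link_table, link_id):
    """Move the temp file into the database and return the document ID."""
    data = Path(temp_pdf).read_bytes()
    if document_id:
        # An ID was passed in, so we are updating an existing document.
        db.execute(
            "UPDATE DocumentStore SET Document = ? WHERE DocumentID = ?",
            (data, document_id),
        )
    else:
        # No ID (0): add a new record and pick up its generated key.
        cur = db.execute(
            "INSERT INTO DocumentStore (Document, LinkTable, LinkID) "
            "VALUES (?, ?, ?)",
            (data, link_table, link_id),
        )
        document_id = cur.lastrowid
    db.commit()

    # The temporary viewing file is no longer needed.
    Path(temp_pdf).unlink()
    return document_id
```

Returning the ID lets the caller store it on the related record, for example in an invoice header, when that is how the design links documents.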

Pete Halsted

 
