View Full Version : HTML File



amy
October 28th, 2003, 04:30 PM
The following is the output from a pdf file converted to html. I am unable to convert this to table format because of the multi-line fields.

When the file is HTML, it looks like the following (Os are used for blank spaces):
A B C
1. X-XOXXXXOOXXXXXXXXXXXXXXXX
OOOOOXXXXOOXXXXXXXXXXXXXXXX
OOOOOOOOOOOXXXXXXXXXXXXXX


2. X-XOXXXOXXXXXXXXXXXXXXXX
OOOOOOOOOXXXXXXXXXX


Fields 1C and 2C could be different widths and have differing heights. Is there a way to use Monarch to recognize the wrapped text and differing field widths?

If the file exists as a text file, then each record consists of several lines with no way to distinguish one field from the next (as most fields are full-text fields).

Ex.

1.XXXXX
XXXXXXXXXXXXXXXXX
XXXXXXXXXXX
XXXXXXX

2.XXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
XXXXXXX
XXX

I have read the directions and can't figure it out. HELP PLEASE!

[ October 28, 2003, 03:35 PM: Message edited by: amy ]

Grant Perkins
October 28th, 2003, 05:36 PM
Amy,

To clarify, is the sample still in the HTML file or have you extracted from the HTML to a table but ended up with multiple rows where you wanted just one?

Also I can't quite work out what you are explaining with the examples and the requirement for 'differing heights'. Probably just me being dumb but clarification would be useful if possible.

If there is a possibility that you could provide a sample of the file you need to work with (HTML?) I would be happy to try to work out a way forward.

Let me know and I will send you a Private Message to provide my email address to which the file can be sent.

Grant


Tom Whiteside
October 28th, 2003, 10:11 PM
Amy,

Using your second example of a text file format, estimate the maximum field width before the forced word wrap; in your case, it looks to be record 2, line 2, with 18 characters. Use a multiple-line field trap with Advanced Field Properties of End Field On Blank field values: 1.

Unless I'm terribly off-base, this should give you records 1 and 2 as single-line fields, each with four "pieces," separated by 3 spaces. Now, if this is indeed your situation, you could concatenate the four pieces with something like the following:

New_Field=RTrim(LSplit([your_field],4," ",1))+RTrim(LSplit([your_field],4," ",2))+RTrim(LSplit([your_field],4," ",3))+RTrim(LSplit([your_field],4," ",4))
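For anyone who wants to check the logic of that formula outside Monarch, here is a rough Python sketch of the same idea. It is only an illustration: the `lsplit` helper is a hypothetical stand-in written for this example (it is not Monarch's implementation), and it assumes the pieces really are separated by exactly three spaces, as described above.

```python
def lsplit(text: str, max_parts: int, sep: str, part: int) -> str:
    """Hypothetical stand-in for an LSplit-style call: split from the left
    into at most max_parts pieces and return piece number `part` (1-based).
    Returns an empty string if that piece does not exist."""
    pieces = text.split(sep, max_parts - 1)
    return pieces[part - 1] if 1 <= part <= len(pieces) else ""


def join_pieces(field: str, sep: str = "   ", parts: int = 4) -> str:
    """Right-trim each piece and concatenate them, mirroring the formula above.
    The three-space separator is an assumption taken from the description."""
    return "".join(lsplit(field, parts, sep, i).rstrip() for i in range(1, parts + 1))


# Record 2 from the text-file example, as it might look after the field trap.
sample = "2.XXXXXXXXXXX   XXXXXXXXXXXXXXXXXX   XXXXXXX   XXX"
print(join_pieces(sample))
```

Note that concatenating with nothing in between also removes the space where a line wrapped between two words; if the real data is ordinary text, joining the pieces with a single space is probably closer to what is wanted.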

This seems to work okay for Monarch 7. If you are using Monarch 6.01, or if your 3 spaces do not disappear, then take a look at the topic "next door" to this one, namely, Hopefully Simple Question (http://mails.datawatch.com/cgi-bin/ultimatebb.cgi?ubb=get_topic;f=1;t=000327#000005).

Please advise if I'm not grasping your situation. The way I'm reading your problem is that you (1) need to capture multiple-line record fields, and (2) strip out any remaining spaces and concatenate the pieces into one field per record.
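If it helps to see those two steps spelled out, the sketch below does the same thing to the plain text file in Python rather than in Monarch. It rests on assumptions that may not hold for the real file: that records are separated by at least one blank line and that every non-blank line belongs to the record above it. The function name `collect_records` is made up for this example.

```python
from typing import Iterable, List


def collect_records(lines: Iterable[str]) -> List[str]:
    """Group consecutive non-blank lines into one record and join them
    with no separator. A blank line ends the current record."""
    records: List[str] = []
    current: List[str] = []
    for raw in lines:
        line = raw.strip()          # drop padding left over from the wrap
        if line:
            current.append(line)
        elif current:               # blank line closes an open record
            records.append("".join(current))
            current = []
    if current:                     # file may not end with a blank line
        records.append("".join(current))
    return records


sample = """\
1.XXXXX
XXXXXXXXXXXXXXXXX
XXXXXXXXXXX
XXXXXXX

2.XXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
XXXXXXX
XXX
"""

for record in collect_records(sample.splitlines()):
    print(record)
```

This prints one line per record, which could then be loaded as a simple single-line report.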

Hope this helps....

[ May 19, 2006, 12:26 PM: Message edited by: Todd Niemi ]