HTML – How to Join Texts and Optimize Formatting with Algorithms

algorithm, formatting, html

I have text objects (text1, text2, etc.) with formatting information (e.g. [bold, italic]). Now I want to concatenate these texts and format them in HTML.

For a simple case it would transform the following

text1[bold,italic]
text2[]
text3[bold]

into this HTML:

<b><i>text1</i></b>
text2
<b>text3</b>

But I want to merge formatting across adjacent texts where possible, rather than format each text individually. For example, the following

text1[bold,italic]
text2[bold]
text3[]

Should result in this HTML:

<b>
  <i>text1</i>
  text2
</b>
text3

Additionally I want to wrap the "longest" formatting to the "outside" of the DOM. For example in the following case, italics is the longer formatting chain and thus wraps around the bold element:

text1[bold,italic]
text2[italic]
text3[]

Result:

<i>
  <b>text1</b>
  text2
</i>
text3

What would be a good algorithmic approach to solve this problem? Answers in pseudocode or any programming language would be helpful.

Best Answer

To minimize the total number of tags you have to write, the simple greedy algorithm is optimal. At each position in the text:

1. If any formatting is turning off, write out end tags matching the preceding start tags, most recent first, until all the required formatting is turned off; then
2. Write start tags for any formatting that needs to turn on (including formatting that step 1 had to close only to reach an outer tag), in descending order of the length of the following text that it covers.
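The two steps above can be sketched in JavaScript roughly as follows. This is a minimal illustration, not the answerer's own code: the segment shape `{ text, formats }`, the `TAGS` map, and the function names `runLength` and `toHtml` are all assumptions made for the example. "Length of the following text" is measured in consecutive segments, which gives the same ordering as character length because all candidate runs start at the same position. HTML escaping of the text is left out.

```javascript
// Hypothetical mapping from format names to HTML tag names.
const TAGS = { bold: "b", italic: "i" };

// Number of consecutive segments, starting at `start`, that carry `format`.
function runLength(segments, start, format) {
  let n = 0;
  while (start + n < segments.length &&
         segments[start + n].formats.includes(format)) {
    n++;
  }
  return n;
}

// Greedy conversion: `segments` is an array of { text, formats } objects.
function toHtml(segments) {
  const open = []; // stack of currently open formats, outermost first
  let out = "";

  segments.forEach((seg, i) => {
    // Step 1: pop end tags until no open format is unwanted here.
    while (open.some((f) => !seg.formats.includes(f))) {
      out += `</${TAGS[open.pop()]}>`;
    }
    // Step 2: open what is missing, longest-running format first.
    const toOpen = seg.formats
      .filter((f) => !open.includes(f))
      .sort((a, b) => runLength(segments, i, b) - runLength(segments, i, a));
    for (const f of toOpen) {
      out += `<${TAGS[f]}>`;
      open.push(f);
    }
    out += seg.text;
  });

  // Close whatever is still open at the end of the text.
  while (open.length > 0) {
    out += `</${TAGS[open.pop()]}>`;
  }
  return out;
}

console.log(toHtml([
  { text: "text1", formats: ["bold", "italic"] },
  { text: "text2", formats: ["italic"] },
  { text: "text3", formats: [] },
]));
// prints "<i><b>text1</b>text2</i>text3"
```

The output is written without whitespace between tags; indentation like that in the question would be a separate pretty-printing step.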

Related Solutions

CSS – Setting Cellpadding and Cellspacing in HTML Tables

Basics

For controlling "cellpadding" in CSS, you can simply use padding on table cells. E.g. for 10px of "cellpadding":

td, th { /* table cells */
    padding: 10px;
}

For "cellspacing", you can apply the border-spacing CSS property to your table. E.g. for 10px of "cellspacing":

table { 
    border-spacing: 10px;
    border-collapse: separate;
}

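This property even allows separate horizontal and vertical spacing, something you couldn't do with old-school "cellspacing". When two values are given, as in border-spacing: 10px 5px, the first is the horizontal spacing and the second is the vertical spacing.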

Issues in IE ≤ 7

This will work in almost all popular browsers except for Internet Explorer up through Internet Explorer 7, where you're almost out of luck. I say "almost" because these browsers still support the border-collapse property, which merges the borders of adjoining table cells. If you're trying to eliminate cellspacing (that is, cellspacing="0") then border-collapse:collapse should have the same effect: no space between table cells. This support is buggy, though, as it does not override an existing cellspacing HTML attribute on the table element.

In short: for browsers other than Internet Explorer 5-7, border-spacing has you covered. For Internet Explorer 5-7, if your situation is just right (you want 0 cellspacing and your table doesn't already define a cellspacing attribute), you can use border-collapse:collapse.

table { 
    border-spacing: 0;
    border-collapse: collapse;
}

Note: For a great overview of CSS properties that one can apply to tables and for which browsers, see this fantastic Quirksmode page.

JavaScript Screen Size – How to Get the Size of the Screen, Web Page, and Browser Window

These days, for screen size you can use the screen object:

window.screen.height;
window.screen.width;

Legacy

You can get the size of the window or document with jQuery:

// Size of browser viewport.
$(window).height();
$(window).width();

// Size of HTML document (same as pageHeight/pageWidth in screenshot).
$(document).height();
$(document).width();
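
Without jQuery, roughly the same measurements can be read from standard DOM properties. The sketch below is an approximation under stated assumptions: `getSizes` is a hypothetical helper name, and it takes the window object as a parameter only so it can be exercised outside a browser. jQuery's document size takes the maximum over several body and root-element measurements, so results may differ from the jQuery calls above in edge cases.

```javascript
// Hypothetical helper: `win` is expected to look like the browser's `window`.
function getSizes(win) {
  const root = win.document.documentElement;
  return {
    // Physical screen, as reported by the screen object.
    screen: { width: win.screen.width, height: win.screen.height },
    // Visible viewport, excluding scrollbars.
    viewport: { width: root.clientWidth, height: root.clientHeight },
    // Full scrollable size of the document.
    page: { width: root.scrollWidth, height: root.scrollHeight },
  };
}
```

In a browser, this would be called as getSizes(window).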
