Reused headings in HTML source file

Re-importing HTML and Word Source Files

In some situations your source files may contain content that will be split into multiple topics rather than just creating a single topic. In addition, depending on the type of content you are working with, the source file may contain a number of "common" headings, resulting in several topics that have the same heading text.

Best Practice: We recommend using unique file names and unique headings, rather than "common" ones, when you are working with the text portions of your documents. Note that images should always use unique file names.

This topic describes how the Importer deals with common headings when a source file has been modified in a way that changes the structure of its content.


Let's look at a scenario where the HTML source file is imported, modified, and then re-imported. When Author re-imports the source file, it checks the content to decide when to overwrite existing objects and when to create new objects. When "common" headings are used, these decisions are based on the combination of the heading and its location in the source file, not on the combination of the heading and its topic content.
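The matching rule described above can be pictured as a key built from the heading text and its position among headings with the same text. The following is a minimal sketch of that idea only; the function name `heading_keys` and the use of an occurrence counter are illustrative assumptions, not Author-it's actual implementation or metadata format.

```python
from collections import Counter


def heading_keys(headings):
    """Pair each heading with its 1-based occurrence number in the file.

    Hypothetical helper: the topic content plays no part in the key,
    only the heading text and where it falls among identical headings.
    """
    seen = Counter()
    keys = []
    for text in headings:
        seen[text] += 1
        keys.append((text, seen[text]))
    return keys


# Assumed heading order for illustration only.
modified = ["Introduction", "Introduction", "Description"]
keys = heading_keys(modified)
# keys == [("Introduction", 1), ("Introduction", 2), ("Description", 1)]
```

Because the key ignores the text under each heading, moving content under a different instance of a common heading changes which topic receives it.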

The sample HTML source file

Before we look at the re-importing process, let's look at how the source file is going to be modified.

  • The first import is going to use the file on the left. It contains an "Introduction" and a "Description".

    We can think of the original Introduction heading as being "mapped" to "location 1" in the source file.

  • The file on the right shows the new content that has been added at the start of the page. This change has resulted in the original content changing its location so it now appears at the end of the page.

    When we look at the modified file, we see that there is still an Introduction heading at "location 1", but it now sits above the new content. There is also a second Introduction heading further down, above the original content. We'll call this part of the file "location 2".

Modifying Source File with New Common Heading
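To make the two versions of the file concrete, here is a small sketch that lists the headings in each one using Python's standard `html.parser` module. The HTML strings are invented stand-ins for the sample files (the real markup and heading level are not shown in this topic), and `HeadingCollector` and `list_headings` are hypothetical names.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h1> element, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.headings[-1] += data


def list_headings(html):
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headings]


# Assumed content: stand-ins for the file on the left and the file on the right.
ORIGINAL_HTML = """
<h1>Introduction</h1><p>Original introduction text.</p>
<h1>Description</h1><p>Original description text.</p>
"""
MODIFIED_HTML = """
<h1>Introduction</h1><p>New introduction text.</p>
<h1>Introduction</h1><p>Original introduction text.</p>
<h1>Description</h1><p>Original description text.</p>
"""

original_headings = list_headings(ORIGINAL_HTML)
modified_headings = list_headings(MODIFIED_HTML)
```

In this sketch the first Introduction in the modified file sits above the new text, and the original text has moved down under the second Introduction.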

How Author imports the source file in this example

When the original file is imported, Author-it finds the Introduction heading and creates a topic (code 1459). Using metadata, this topic will be mapped to our "location 1" from the source file, that is, the first instance of an "Introduction" heading. Author-it then adds the content that it finds under this heading to the topic.

Key point for this import: Topic object 1459 uses metadata that maps it to the first instance of an Introduction heading; the content in the topic is not relevant.

First import of HTML source file

Book structure

HTML Source File with Reused Headings_First Import

Book with Reused Headings_First Import

When the modified file is re-imported, Author finds there are now two Introduction headings. There is still an Introduction heading at "location 1", so it will re-use topic 1459. As part of the re-import process, Author-it overwrites the topic's content using the new text from the modified source file.

As Author-it continues the re-import process, it finds another instance of an Introduction heading, so it creates a new topic (code 1461) for this location in the book. It applies metadata that identifies this topic as the second instance of an Introduction heading (again ignoring the content under the heading). Author then copies the text under this heading into the topic.

Key point for this import: Author now finds two instances of the Introduction heading (ignoring the content under the heading). It re-uses the existing topic object for the first instance of the heading and creates a new topic object for the second instance of the heading.

Re-import of modified HTML source file

Book structure

HTML Source File with Reused Headings_Updated

Book with Reused Headings_Updated

If the file were modified again by adding another "Introduction" and then re-imported, Author would go through the same process. It would identify the common headings and their locations in the file, then use the metadata assignments to decide when to reuse an existing topic and when to create a new one.

Final result of these actions

Topic 1459 was created during the first import with the "original" Introduction text. On the re-import, the topic object was re-used at the same location in the book, but the content was overwritten with the new "Introduction" text. A new topic (object 1461) was created for the second instance of the heading in the source file and now contains what had been the "original" Introduction text.
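The whole sequence can be replayed with a toy model. This is a sketch under stated assumptions, not the Importer's real logic: `ToyImporter` is a hypothetical class, topic codes are assumed to be handed out sequentially, the Description topic is assumed to receive code 1460, and the section text is invented.

```python
from collections import Counter


class ToyImporter:
    """Toy model: topics are matched on (heading, occurrence), never on content."""

    def __init__(self, first_code):
        self.next_code = first_code  # assumption: codes are issued sequentially
        self.code_by_key = {}        # (heading, occurrence) -> topic code
        self.content = {}            # topic code -> text under the heading

    def import_file(self, sections):
        seen = Counter()
        actions = []
        for heading, text in sections:
            seen[heading] += 1
            key = (heading, seen[heading])
            if key in self.code_by_key:
                # Same heading at the same location: reuse and overwrite.
                code = self.code_by_key[key]
                actions.append((code, "overwritten"))
            else:
                # No topic mapped to this location yet: create one.
                code = self.next_code
                self.next_code += 1
                self.code_by_key[key] = code
                actions.append((code, "created"))
            self.content[code] = text
        return actions


importer = ToyImporter(first_code=1459)

first = importer.import_file([
    ("Introduction", "original introduction"),
    ("Description", "original description"),
])

second = importer.import_file([
    ("Introduction", "new introduction"),
    ("Introduction", "original introduction"),
    ("Description", "original description"),
])
```

After the second call, `importer.content[1459]` holds the new introduction text and `importer.content[1461]` holds the original introduction text, which mirrors the result described above.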