Automatic Readability Analysis

I started applying natural language processing technology to improve the editing process in the early 1980s. I wrote a program for my employer that computed the Flesch readability index and identified sentences and words that were hard to understand.

This program was used by publicly regulated companies such as utilities and insurers, which were required to comply with state regulations on the “ease of understanding” of their documents. I also applied it to all of the tutorial and marketing materials that I wrote for that employer.

There are readability analysis tools of varying quality that are widely available today. Microsoft Word will compute the Flesch readability index of a document. Several websites offer more comprehensive services. I use some of them in the process of editing.

Automatic Index Generation

Computer science has come a long way since then. I continue the research into natural language processing that I began over fourteen years ago, when I earned an M.S. in Applied Cognition from the University of Texas at Dallas, specializing in human-computer interaction and natural language processing.

One of my graduate research projects was a program that would generate a subject index for a book. I have improved this program over time. I benchmark it against indices created by professional indexers, including an index for one of my own books.
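At its simplest, a subject index maps terms to the pages on which they occur. The sketch below illustrates that core idea only (it is not the program described above, and the term list is hand-chosen for illustration; real index generation also has to select, normalize, and group entries editorially):

```python
from collections import defaultdict

def build_index(pages, terms):
    """Map each term to the page numbers where it appears.

    pages: list of page texts, 1-indexed by position.
    terms: hypothetical hand-picked index entries.
    """
    index = defaultdict(list)
    for page_num, text in enumerate(pages, start=1):
        lowered = text.lower()
        for term in terms:
            if term.lower() in lowered:
                index[term].append(page_num)
    # Sort entries alphabetically, as in a printed index.
    return dict(sorted(index.items()))

pages = ["Readability metrics such as Flesch.",
         "Indexing and readability."]
print(build_index(pages, ["readability", "Flesch", "indexing"]))
```

Automating the mechanical part (finding every occurrence of every candidate term) is what lets the generated index be benchmarked against one built by a professional indexer.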

A non-fiction book without a subject index is not a serious work. The manuscript I deliver to you will already have a very good index, generated by that program.

As a result, you’re going to have a better book, and you’re going to save your publisher some money. You should be able to negotiate something extra from your publisher, since you saved them the cost of hiring an indexer. Perhaps you can get more gratis copies of your book, or more money spent on cover art.

Automatic Semantic Analysis

Another of my graduate research projects was a program that analyzes text and builds a “knowledge representation” version of it. It can find text that is hard for people to understand due to issues like grammatical complexity and semantic ambiguity. I have improved this program over time by processing large quantities of text about cybersecurity, another of my areas of technical expertise, so I have access to plenty of relevant material.

The manuscript I deliver to you will have been processed by this program. I use it as I write. It finds problems like ambiguous grammar, unidentifiable topic sentences, or pronouns with ambiguous referents. When it finds one of these, I rewrite the offending sentences or paragraphs. You will never see the text it objects to, but your manuscript will be easier to understand because I used it.


