punctuation.art

May 21st, 2012

We read in the introduction to Mr. Addad’s jGnoetry:

[…Q]uotation marks, parenthesis, and brackets, […] are tricky to handle in bigram generation systems because you can’t be guaranteed that an open-bracket will have a matching close-bracket.

No argument here. Both the opening and closing tokens of any pair are unlikely to fall within the short range of any interesting n-gram model.

But what if we modify the model to make it want to close off the pair?
That is, translate “want” into some sort of algorithmic rule that increases the chance of closing off the pair.
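
One way that “want” might be sketched, and this is only a guess at an approach, not how jGnoetry or any other existing tool does it: keep a stack of the closers the text still owes, and on each step give the innermost one a fixed chance of being emitted in place of the usual bigram pick. The names here (`PAIRS`, `build_bigrams`, `generate`, `close_bias`, `is_balanced`) are all made up for illustration, and only curly quotes are handled, since straight quotes use the same character to open and close.

```python
import random
from collections import defaultdict

# Hypothetical pair table: opener -> closer.
PAIRS = {"(": ")", "[": "]", "“": "”"}
CLOSERS = set(PAIRS.values())


def build_bigrams(tokens):
    """Map each token to the list of tokens seen right after it."""
    model = defaultdict(list)
    for a, b in zip(tokens, tokens[1:]):
        model[a].append(b)
    return model


def generate(model, start, length, close_bias=0.3, rng=random):
    """Bigram walk that 'wants' to close any pair it has opened.

    close_bias is the per-step chance of emitting the owed closer
    instead of a normal bigram pick. Closers still owed when the walk
    stops are appended, so the result can run a little past `length`.
    """
    out = [start]
    owed = [PAIRS[start]] if start in PAIRS else []
    while len(out) < length:
        if owed and rng.random() < close_bias:
            nxt = owed.pop()  # the "want": close the innermost pair now
        else:
            # Drop closers that would not match the innermost open pair.
            candidates = [
                t for t in model.get(out[-1], [])
                if t not in CLOSERS or (owed and t == owed[-1])
            ]
            if not candidates:
                break
            nxt = rng.choice(candidates)
            if nxt in PAIRS:
                owed.append(PAIRS[nxt])
            elif nxt in CLOSERS:
                owed.pop()
        out.append(nxt)
    out.extend(reversed(owed))  # close whatever is still open, innermost first
    return out


def is_balanced(tokens):
    """True if every opener in tokens is closed in the right order."""
    owed = []
    for t in tokens:
        if t in PAIRS:
            owed.append(PAIRS[t])
        elif t in CLOSERS:
            if not owed or owed.pop() != t:
                return False
    return not owed
```

Feeding it something like `build_bigrams("we saw ( a dog ) and [ a cat ] today".split())` and calling `generate(model, "we", 12)` should always give back balanced brackets. The price is that a forced closer may follow a word it never followed in the source text.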
read more…

sensitive.art

May 10th, 2012

I continue to tweak the TextMunger.

Currently, it’s case-sensitive all the time, which means that ALL CAPS words and phrases do not mingle well with lower-case words and phrases. So, I thought, make it NOT case-sensitive, and ALL CAPS sentences will intermix!

Which, yes! They do!

The Law of Unintended (or Not Realized Ahead of Time) Consequences says that it will also lose track of where sentences start and end, which upper- and lower-case (U&LC) recognition had been helping with. This is, after all, dumb-as-statistical-sticks Markov chaining.
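
A toy sketch of why this happens, assuming a plain word-level chain (the real TextMunger internals may well differ, and `build_chain` is a made-up name): once keys are lower-cased, the sentence-opening “The” and the mid-sentence “the” become a single state, so the chain can no longer tell a word that begins a sentence from one buried in the middle.

```python
from collections import defaultdict


def build_chain(words, case_sensitive=True):
    """Word-level Markov chain: word -> list of observed next words.

    With case_sensitive=False every word is lower-cased first, so
    'The', 'THE' and 'the' all collapse into the same key.
    """
    chain = defaultdict(list)
    for a, b in zip(words, words[1:]):
        if not case_sensitive:
            a, b = a.lower(), b.lower()
        chain[a].append(b)
    return chain


words = "The cat sat. THE DOG RAN. She fed the dog.".split()

sensitive = build_chain(words, case_sensitive=True)
folded = build_chain(words, case_sensitive=False)

# Case-sensitive: three separate states, and 'The' only starts sentences.
print(sensitive["The"], sensitive["THE"], sensitive["the"])

# Folded: one state; ALL CAPS mixes in, but the capital-letter cue is gone.
print(folded["the"])
```

In the folded chain, “the” leads to “cat”, “dog” and “dog.” alike, which is the intermixing that was wanted and the lost sentence boundaries that were not.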

Witness:
read more…
