Banalified Moby-Dick #85

Open
jeffbinder opened this issue Dec 1, 2020 · 4 comments

jeffbinder commented Dec 1, 2020

I am using the BERT language model to banalify the text of Moby-Dick—that is, to replace Melville's rich language with sentences that better conform to the model's expectations. I'm employing a set of Python functions I have been developing for a variety of text-generation projects, including experiments with rhyme and meter; my code and some more examples are available here.

The algorithm is as follows:

  • The text is divided into chunks of 10 tokens (words counted as single tokens)
  • For each chunk:
    • Run BERT's prediction procedure with each word masked, giving the model 100 tokens of context to the left and right
    • Find the word whose probability is lowest according to the model
    • Substitute in BERT's top prediction for what word should appear there
    • Repeat until no more changes can be made (max 500 loops)
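The loop above can be sketched roughly as follows. This is a minimal illustration, not the code linked above: `banalify`, `predict`, and `toy_predict` are hypothetical names, and `predict` stands in for a call to a masked language model such as BERT that returns a word-to-probability mapping for one masked position. The sketch also assumes that a word is only a candidate for replacement when the model's top prediction differs from it, which is one reading of "until no more changes can be made".

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical interface: given a window of words and an index into it,
# return the model's probability for each candidate word at that position.
Predictor = Callable[[List[str], int], Dict[str, float]]


def banalify(tokens: List[str], predict: Predictor, chunk_size: int = 10,
             context: int = 100, max_loops: int = 500) -> List[str]:
    tokens = list(tokens)
    for start in range(0, len(tokens), chunk_size):
        end = min(start + chunk_size, len(tokens))
        # Context window: up to `context` words on each side of the chunk.
        lo = max(0, start - context)
        hi = min(len(tokens), end + context)
        for _ in range(max_loops):
            window = tokens[lo:hi]
            # (probability of current word, position, replacement)
            best: Optional[Tuple[float, int, str]] = None
            for i in range(start, end):
                dist = predict(window, i - lo)
                top = max(dist, key=dist.get)
                p = dist.get(tokens[i], 0.0)
                # Assumption: skip words the model already agrees with.
                if top != tokens[i] and (best is None or p < best[0]):
                    best = (p, i, top)
            if best is None:
                break  # no more changes can be made in this chunk
            tokens[best[1]] = best[2]  # swap in the top prediction
    return tokens


# Toy stand-in for the model, for illustration only.
COMMON = {"Ishmael": "Frank", "purse": "pocket"}


def toy_predict(window: List[str], i: int) -> Dict[str, float]:
    word = window[i]
    if word in COMMON:
        return {COMMON[word]: 0.9, word: 0.1}
    return {word: 1.0}


print(banalify("Call me Ishmael".split(), toy_predict))
```

With a real model, `predict` would mask position `i`, run the model on the window, and read the distribution off the masked slot. The `max_loops` cap matters because successive substitutions can in principle keep changing each other's predictions.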

The result is a text that retains much of the sentence structure and some of the meaning of the original, but that uses more normative language and that sometimes diverges into surreal absurdity. Here, for instance, is the famous opening of Melville's novel:

Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off—then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.

Here is the banalified opening:

Call me Frank. Some time ago—not sure how long ago—with little or no money in my pocket, and nothing else to get me on board, I decided I would go out a little and see the little wonders of the world. It is one way I have of getting off this ship and into the world. Whenever I find myself overly concerned about the weather; whenever there is a cold, wet Feeling in my nose; whenever I find myself deliberately stepping into the street, and knocking off the hat of some person I pass; and then when my thoughts have such an overwhelming effect on me, that it takes a great physical effort to keep me from deliberately stepping into the street, and accidentally knocking people’s hats off—well, I think it is best to get over it as quickly as I can. This is a matter of bow and arrow. With a great effort He supports himself on his staff; And i look towards the ocean. There is nothing wrong with that. If you will believe me, then all those in this city, one way or another, have had much the same view of the ocean as i.

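The current version of the function retains the capitalization of the original word at each position, which is why the capitalization is sometimes wrong (for example, "Feeling" where the original had "November", or "i" where the original had "quietly").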
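A capitalization-preserving substitution of this kind might look like the sketch below. `match_case` is a hypothetical helper, not a function from the project's code, and it assumes that only the first letter's case is carried over from the original word.

```python
def match_case(original: str, replacement: str) -> str:
    """Give `replacement` the same initial capitalization as `original`."""
    if not original or not replacement:
        return replacement
    if original[0].isupper():
        return replacement[0].upper() + replacement[1:]
    return replacement[0].lower() + replacement[1:]


print(match_case("November", "feeling"))
print(match_case("quietly", "I"))
```

Under this assumption, a common noun that replaces a proper noun is capitalized, and "I" is lowercased when it replaces an ordinary lowercase word, matching the oddities in the passage above.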

I also created versions of Hamlet and A Portrait of the Artist as a Young Man. The banalification of the "To be, or not to be" speech is illustrative:

To be, or not to be, what is the choice:
Whether ’tis right of the man to take
The bow and arrow of good fortune,
Or to take arms against a multitude of enemies,
And by deed destroy them? To be—to be,
No more; for without a word of warning we lose
The life-force, and a good many times
The soul is lost also: ’tis a pity
Not to be escape’d. To be, to be.
To dream, not to dream—nay, that’s the trouble,
For at this moment in time what day may come,
When we are released from our mortal bonds,
Will give us hope. That’s the hope
That lets us live a long life.
For who would bear the whips and chains of life,
The king’s wrath, the rich man’s greed,
The lack of god’s mercy, the priest’s abuse,
The abuse of god, and the abuse
Of the soul of a good man,
When that soul is mine to break
With my own hands? What would we then do,
To live and enjoy such a long life,
Knowing that the country of life and death,
The king’s country, to whose land
No one returns, breaks our hearts,
And makes us to forget the love we used
To have for those whom we know nothing of?
For this would make enemies of us all,
And when the bright light of day
Is mingled o’er with the dark shadow of night,
The hearts of men at that time,
Are such that their thoughts turn inward
And seek the peace of sleep. Thank you indeed,
My sweet Ophelia! Hamlet, in thy name
Be my honour be honour’d.

@hugovk hugovk added the preview label Dec 1, 2020

hugovk commented Dec 1, 2020

😆

Some time ago—not sure how long ago...

whenever there is a cold, wet Feeling in my nose;

jeffbinder commented

Wow, that took a long time, even with a GPU. I'm way past the deadline, but it's done: Banalified Moby-Dick.

cameronedmond commented

This is really interesting - I love how making Moby Dick more banal turns Ishma-Frank into a hat knockin' vagabond. Nice one, Jeff!

altsoph commented Dec 26, 2020

Such a great idea!
