The History of Natural Language Processing

Hello! I am a student in the History of Computing class at San Jose State University (http://www.cs.sjsu.edu/~mak/CS185C/). This is a work in progress that will turn into a final article by the end of the semester. I welcome your comments and advice!

My official topic is: "The history and development of text-based natural language processing." It is a very interesting topic to me, and I plan to visit SRI within the next two weeks or so. I will post more information as I go!

If anyone has information they would like to share, please email me at rlichtig@gmail.com, or start a discussion about it.

Thank you very much and all input is helpful!

- Ryan Lichtig, Sophomore, SJSU Computer Science Major

General Format:
I. Intro
II. First attempt at machine translation. Stats and Linguistics.
III. Second attempt, pure statistics.
IV. Third attempt, current. Statistics and Linguistics.
V. Conclusion, where NLP is headed.

Rough Draft of Introduction and first section:

Talos, in Greek mythology, is the guardian of Europa and her land of Crete. Forged by the divine smith Hephaistos, Talos is an automaton, an autonomous machine of bronze that patrolled Europa’s land, protecting it against enemies and invaders. This divine guardian gave rise to the idea of synthetic life and intelligence, but the idea was only that: a concept. The capability of creating such magnificent devices was left to the gods themselves, something no human could ever achieve. However, thousands of years later, in 1818, Mary Shelley immortalized Frankenstein’s monster and changed the idea from something divine to something human, with artificial life and intelligence created through the medium of science. This is where the modern idea of artificial life and intelligence begins: something man-made could now be created, and it would have the ability to live, to learn, and, most importantly, to adapt. While Mary Shelley’s novel was purely fiction, it allowed the thought of human-made synthetic life to take hold, rewriting the idea long instilled by the Greek myths.

In 1944, during World War II, another major advancement took place: the creation of Colossus. This computer, although kept secret for years by Great Britain, helped decrypt German messages encrypted with the Lorenz cipher. Colossus could be considered one of the first modern computers, a technology that allowed a superhuman number of calculations to occur in a relatively small amount of time. With biological science proving ineffective for creating synthetic life, humanity turned to technology and computers in its quest for artificial life and intelligence. Shortly after World War II ended came the Cold War with the Soviet Union. The fear of nuclear war and Soviet spies sparked the development of natural language processing, beginning with the translation of the Russian language, both spoken and written, to English and culminating in modern marvels such as Watson and the mobile application Siri.

With tensions running high and missiles at the ready, natural language processing was invented, focusing on machine translation. A formal definition of machine translation is “going by algorithm from machine-readable source text to useful target text, without recourse to human translation or editing” (Automatic 19). This tool was considered vital to the United States government because, when fully developed, it would enable the government to translate Russian text to English with a low chance of error and at speeds faster than humans could manage. Because the government needed it, funding was readily available.

In 1954, Georgetown University and IBM came together to perform an experiment converting more than sixty Russian sentences to English using the IBM 701 mainframe computer. Their method of machine translation was to use computational linguistics, a combination of statistics and rules of language. The researchers acknowledged that machine translation still had problems but claimed they would be solved within the next three or four years. Instead, things grew increasingly difficult.

The United States National Research Council (NRC) founded the Automatic Language Processing Advisory Committee (ALPAC) in 1964. ALPAC was the committee assigned to evaluate the progress of NLP research. In 1966, ALPAC and the NRC halted research on machine translation because progress, after ten years and twenty million dollars, had slowed and machine translation had become more expensive than manual human translation. This first major setback in AI came down to funding: researchers could not get the money required to do research, and the first attempt at machine translation had failed.

Please let me know if you have any suggestions!

Comments

Not quite getting the connection you're trying to establish between the quest for artificial life and natural language processing. Wouldn't the development of AI preclude the need for natural language processing, as a truly "intelligent" AI would implicitly understand language in a human way? Natural language processing would seem to only be relevant in a world where we are trying to make non-sentient machines process language in intelligent and natural ways.

Nor do I think there is a strong case that natural language processing, a la Watson, equates to Artificial intelligence/life in the broad sense you seem to imply?

--Srterpe 14:01, 11 December 2011 (EST)

Srterpe-

Thank you for your input. This is merely the Introduction. The main idea that I was trying to show was simply the development of artificial intelligence throughout time and the change of the concept from something divine to something possible for humans. I understand your point, but there seems to be a misunderstanding: AI is a very broad subject and NLP is just a tiny branch of it. There are AI machines that simply observe the world and find a way through an obstacle course, like Shakey, while others learn chess or drive autonomously across the desert. In this introduction I am unifying artificial life and intelligence to set up the idea of Machine Learning, which will be brought up in the next paragraphs.

I believe Watson was actually a very important step in NLP because it took the questions asked, deciphered them, then found a response through its databases (not connected to the internet). All of these processes are found in NLP in one form or another, so it is an adequate model of a modern marvel. However, if there is another one that you believe is better, please do not hesitate to inform me!

I hope that clarifies things, and thank you very much for your comment!

- Ryan Lichtig

No, I agree entirely, Watson is an amazing example of natural language processing. Just not sure that we're any closer to truly sentient machines or synthetic life as a result.

Anyway I don't want to keep you from finishing your paper, I was just cruising around seeing what others in class were doing.

--Srterpe 21:14, 11 December 2011 (EST)

Oh, I see, and you are completely right. We are not very close to sentient beings and truly artificial life. But, as I will most likely argue at the end of my essay, we seem to be getting there, and it might be sooner than we think.

Good luck on your project too.

- Ryan Lichtig