Communication is essential!
The National Geologic Map Database (NGMDB) is a Congressionally mandated project designed to build a suite of databases and standards to support the public's use of geoscience map information. The Chief of the NGMDB (Dave Soller, U.S. Geological Survey) regularly meets with colleagues in other geological surveys, and especially with other Federal surveys in which similar work is being planned or conducted. The purpose of these meetings is to share information and to improve the design and quality of the databases and standards under development within the NGMDB and other agencies. To most geoscientists, the terminology and concepts of Information Technology and database design are relatively new and unfamiliar. It can therefore be especially difficult to convey the subtle meaning of these technical terms and concepts to colleagues who speak different languages.
Our translators and methodology:
To help address this problem, Guy Pinhassi suggested that, if selected NGMDB documents could be translated into other languages, colleagues in other countries could more easily understand the NGMDB plans and database design. At that time (July 2001) Mr. Pinhassi was an undergraduate Geoscience student at the University of Arizona (in Tucson, Arizona) and also worked in the ESPRI GIS Lab. (ESPRI, the Earth Surface Processes Research Institute, is a joint venture between the USGS and the University of Arizona. John Sutter, Chief Scientist of the USGS Western Earth Surface Processes Team and Co-director for Development of ESPRI, and Bear Pitts, Associate Director for Development of ESPRI, were instrumental in securing funds and facilities for this work.) The NGMDB Chief enthusiastically supported this idea and asked Mr. Pinhassi to proceed with a prototype. After some study of the problem, Pinhassi proposed a combination of machine and human translation of selected NGMDB documents into Spanish and began assembling a team to run a prototype of this process. [Pinhassi's initiative is especially impressive because Hebrew is his first language and he does not speak Spanish.]
During summer 2002, Pinhassi, assisted by Barton Cutter, another University of Arizona undergraduate, enlisted Benjamin Lopez of the Youth Volunteer Corps (YVC) and the Tucson Hispanic Chamber of Commerce to find Spanish-speaking "youth at risk" who would benefit from exposure to this type of project. He also recruited Harry McGregor of the Open Source Education Foundation (OSEF) to develop the distributed computational platform on which the prototype could be run. The project invested in a copy of the SYSTRAN Professional Premium software package (SYSTRAN) to use as the machine translation tool.
Finding motivated high school students who could work with this project proved a challenge until Pinhassi met Jonathan Levi, Assistant Director of the University of Arizona's National Center for Interpretation (NCI) (http://nci.arizona.edu/). Levi had just completed a summer session of the Professional Language Development Project (PLDP), whose focus is developing the language skills of bilingual students to meet the increased demand for bilinguals in health care, education, business, law enforcement, social service agencies, and the court system. Levi needed internship positions for the PLDP graduates, and Pinhassi's project offered an excellent opportunity.
A group of PLDP graduates, fourteen high school students, began the project in September of 2002. Four students completed the prototype project in June, 2003: Claudia Cota, Natalia Dabdoub, Isis Urtusuaztegui, and Luis Jimenez. The other ten students who participated in the project were: George Apodaca, Jose Barajas, Fernando Barrera, Magdalena Chavarria, Eric Figueroa, Gladisnis Figueroa, Arelis Velasquez, Geni Peregrino, Sandra Lopez, and Claudia Valenzuela.
The project began with machine translation of the technical documents. The SYSTRAN software is highly regarded, and the tool was able to accurately translate between 70% and 80% of a document. However, it translated one word at a time and could not evaluate the context of each word, and therefore the output was unusable. Further, geologic and technical terms were not included in the software's vocabulary. At times, the students found the machine-translated results amusing.
Because machine translation of these documents did not produce a useful product, the students translated the documents themselves, and we needed to identify a highly literate bilingual geologist to proofread the students' work. The project was greatly enhanced by the recruitment, in January 2003, of Guillermo Garcia, a native Spanish-speaking geologist who is familiar with English-to-Spanish translation of technical and scientific subjects through his experience with these subjects in Europe and Latin American countries. Presently, Garcia is a PhD student in the University of Arizona's Natural and Renewable Resources program.
With Garcia's assistance, the students successfully translated seven documents that provide a general overview of the NGMDB.
What did we learn from this prototype?
This page is <URL: http://ncgmp.usgs.gov/ngmdbproject/reports/reports-esp-enginfo.html>