According to tests, a gender analysis program developed by Stevens Institute of Technology researcher Na Cheng and colleagues could have correctly identified the sex of a 40-year-old U.S. man who was writing online as a gay Syrian girl.
The software permits users to either upload a text file or paste in a paragraph of 50 words or more for analysis. The program was based on a vast corpus of documents that the researchers screened for psycholinguistic factors, and they winnowed the more than 500 factors they uncovered down to 157 gender-significant ones. These cues were then combined by the program through a Bayesian algorithm that guesses gender according to the balance of likelihoods suggested by the factors.
The program has three gender judgments to choose from: male, female, and neutral. A judgment of neutral might signal that someone is attempting to write in a gender voice that is unnatural to them. When fed text, the software's assessment of a male or female author is correct only 85 percent of the time, but the researchers say its accuracy will improve as more people use it and alert it to wrong guesses.
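
To make the idea of combining cues by a "balance of likelihoods" concrete, here is a minimal sketch in Python. It is not the researchers' program. It assumes a naive Bayes model (cues treated as independent), which the article does not confirm, and the three cue names, their probabilities, and the `neutral_band` threshold are all invented for illustration.

```python
import math

# Hypothetical likelihoods P(cue present | gender). These cue names and
# numbers are invented for illustration; they are not the study's 157 factors.
CUE_LIKELIHOODS = {
    "uses_emotion_words": {"female": 0.60, "male": 0.40},
    "uses_articles_heavily": {"female": 0.45, "male": 0.55},
    "uses_first_person": {"female": 0.58, "male": 0.42},
}


def classify(cues, prior_female=0.5, neutral_band=0.1):
    """Return (label, P(female | cues)) for a dict of cue name -> bool.

    The label is 'neutral' when the posterior lies within neutral_band
    of 0.5 (an assumed rule; the article does not say how the real
    program decides on a neutral judgment).
    """
    # Work in log space so that many small probabilities do not underflow.
    log_f = math.log(prior_female)
    log_m = math.log(1.0 - prior_female)
    for name, present in cues.items():
        p = CUE_LIKELIHOODS[name]
        log_f += math.log(p["female"] if present else 1.0 - p["female"])
        log_m += math.log(p["male"] if present else 1.0 - p["male"])

    # Bayes' rule for two classes: normalize the two joint scores.
    p_female = 1.0 / (1.0 + math.exp(log_m - log_f))

    if abs(p_female - 0.5) <= neutral_band:
        return "neutral", p_female
    return ("female" if p_female > 0.5 else "male"), p_female


if __name__ == "__main__":
    sample = {
        "uses_emotion_words": True,
        "uses_articles_heavily": True,
        "uses_first_person": False,
    }
    label, p = classify(sample)
    print(label, round(p, 3))
```

With mixed cues like those in the sample, the posterior stays close to 0.5 and the sketch returns the neutral label. That mirrors the article's point that text written in an unnatural gender voice may send conflicting signals.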
From New Scientist