Date of Award

Spring 3-10-2017

Document Type


Degree Name

Doctor of Philosophy in Industrial/Organizational Psychology (PhD)


Industrial/Organizational Psychology

First Advisor/Committee Member

Dr. Dana Kendall

Second Advisor/Committee Member

Dr. Ryan C. LaBrie

Third Advisor/Committee Member

Dr. Christopher Roenicke


Keywords

employee selection, text analytics, cognitive ability, computer science, biographical data, employee recruiting


Text analytics using term frequency was proposed as an extension of biodata for predicting job performance and for addressing criticisms of biodata and predictor methods, namely that they do not identify the constructs they are measuring or their predictive elements. Linguistic Inquiry and Word Count software was used to analyze and sort text into validated categories. Prolific Academic was used to recruit full-time workers who provided a copy of their resume and were assessed on impression management (IM), cognitive ability, and job performance. Predictive analyses used resumes with 100+ words (n = 667), whereas correlational analyses used the full sample (N = 809). Third-person plural pronouns, impersonal pronouns, sadness words, certainty words, non-fluencies, and colons emerged as significant predictors of job performance (χ²(10) = 26.01, p = .006). As hypothesized, impersonal pronouns were positively correlated with self-oriented IM (r = .07, p < .05), and first-person singular pronouns were positively correlated with other-oriented IM (r = .07, p < .05); however, first-person plural pronouns were negatively correlated (r = -.07, p < .05). Pronouns and verbs were not predictive of job performance. Positive and negative emotion words did not show hypothesized relationships to OCBs, CWBs, or job performance. Finally, differentiation words (r = .09, p < .01), conjunctions (r = .28, p < .01), words longer than six characters (r = .29, p < .01), prepositions (r = .20, p < .01), cognitive process words (r = .19, p < .01), causal words (r = .20, p < .01), and insight words (r = .06, p < .05) correlated with cognitive ability but did not predict job performance.
In an exploratory regression analysis, cognitive ability as measured by the Spot-The-Word Test (β = .10, p < .05) and a composite of cognitive ability created from text analytics (β = .15, p < .05) both uniquely and significantly predicted job performance (F(1, 805) = 18.79, p < .001), demonstrating that word categories can serve as a proxy for cognitive ability. Overall, the method of text analytics sidesteps some of the limitations of biodata predictor methods while demonstrating the potential to automate resume reviews and mitigate unconscious bias inherent in human judgment.
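To make the term-frequency idea concrete, the following is a minimal sketch of category-based word counting, in which each category is scored as a percentage of the total words in a text. It is an illustration only: the category word lists, the `tokenize` and `category_rates` helpers, and the sample sentence are all hypothetical, and none of them reproduces the Linguistic Inquiry and Word Count dictionaries or the scoring actually used in the dissertation.

```python
import re

# Hypothetical toy categories for illustration; these are NOT the LIWC
# dictionaries and are far smaller than any validated word list.
TOY_CATEGORIES = {
    "first_person_singular": {"i", "me", "my", "mine", "myself"},
    "first_person_plural": {"we", "us", "our", "ours", "ourselves"},
    "conjunctions": {"and", "but", "or", "because", "although", "while"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(text, categories=TOY_CATEGORIES):
    """Return each category's share of total words, as a percentage."""
    words = tokenize(text)
    total = len(words)
    if total == 0:
        # An empty text has no words to score, so every rate is zero.
        return {name: 0.0 for name in list(categories) + ["long_words"]}
    rates = {
        name: 100.0 * sum(w in vocab for w in words) / total
        for name, vocab in categories.items()
    }
    # Words longer than six characters, one of the measures named above.
    rates["long_words"] = 100.0 * sum(len(w) > 6 for w in words) / total
    return rates


sample = "I led our team and we delivered the project because planning mattered"
rates = category_rates(sample)
for name, value in rates.items():
    print(f"{name}: {value:.1f}%")
```

Scoring categories as percentages of total words, instead of raw counts, keeps resumes of different lengths comparable. Percentages from very short texts are unstable, which is one plausible reason to set a minimum length such as the 100-word cutoff used for the predictive analyses.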


The impetus for this dissertation came in 2011 while I was working as an entry-level consultant with a local Seattle consulting company. I was assigned to work on a project with an intellectual property firm to help the US Patent Office process patent applications more quickly. In 2011, it took about three years for a patent to be officially accepted or rejected. We used text analytics to try to identify patent applications that should be rejected because the idea had already been patented. Up until that point, I was not aware that text could be used in such a way. I was fascinated with the potential for text to be analyzed and mined for insight and immediately began considering its application to IO psychology as a tool for automating resume reviews.

Initially, I considered text analytics as a tool to add rigor to the keyword searches that applicant tracking systems (ATS) use to crudely screen resumes, as well as a way to deliver value to organizations by reducing time spent hiring talent, while also protecting applicants from recruiter or hiring manager bias by doing a “blind” resume review. Truthfully, I was more interested in applying the technique to resumes than in extending and building on IO psychology theory. After all, text analytics had been used to identify sex (Cheng, Chandramouli, & Subbalakshmi, 2011), detect mood (Nguyen, Phung, Adams, & Venkatesh, 2014), and even predict stock prices (Bollen, Mao, & Zeng, 2010). My rationale was to use the transitive property to argue that if text analytics could be used for those purposes, why not extend its use to evaluating resumes? However, I knew this would not fly; my advisor would never allow such a flimsy theoretical argument as the basis for a dissertation (and rightly so, I might add).

In digging deeper into IO psychology selection research, I happened upon biodata and immediately saw a connection (albeit a tenuous one) between text analytics and biodata, and the rest—well the rest, as they say, is history…or at least I hope so!

Copyright Status

Additional Rights Information

Copyright held by author.

