Data Science CV-NLP, Summer Intern

  • Lehi, Utah, United States, 84043
  • Full time

About Ancestry:

When you join Ancestry, you join a human-centered company where every person’s story is important. We believe that by discovering the struggles and triumphs of our past, we can foster deeper bonds and more meaningful connections among families and communities. Our talented team of scientists, engineers, genealogists, historians, and storytellers is dedicated to empowering customers around the world from all backgrounds on their journeys of personal discovery. 

With 30+ billion digitized global historical records, 125+ million family trees, and 22+ million people in our growing AncestryDNA database, Ancestry helps customers discover their family story and gain a new level of understanding about their lives. Passionate about dedicating your work to enriching people’s lives? You belong at Ancestry.

What you will do:

Ancestry is looking for an exceptional, passionate, and highly motivated Data Science CV-NLP Intern to join our Data Science Computer Vision & Natural Language Processing team this summer. The Data Science CV-NLP team develops CV and NLP models that together extract and organize text and image information from billions of historical and genealogical records, helping customers discover and connect with their family history. As an intern on the Data Science CV-NLP team, you will build and train models that promote product development, customer success, and content creation across our Family History business. You will also work closely with engineering teams to train, optimize, and deploy models.

  • Implement state-of-the-art Computer Vision methods in areas including document layout analysis, classification, segmentation, object detection, and redaction across various genealogical and historical collections such as newspapers, city directories, family history books, and birth, marriage, and death records.

  • Analyze model performance, refine labeling specifications, and iterate with labeling resources to curate and refine training sets that improve performance.

  • Collaborate with ML Ops and Data Science Engineers to deploy datasets, truthsets, models, and training and inference code to a cloud-based model registry.

  • Effectively communicate and present deliverables and solutions to teams, stakeholders, and executives.

Who You Are: 

  • Candidate for an advanced degree (MS/PhD) in Computer Science, Statistics, Mathematics, Linguistics, Engineering, or a data-related quantitative field

  • Specialization in natural language processing, computer vision, deep learning, machine learning, or related software development

  • Experience understanding and implementing published models and methods for practical application and real-world problems

  • Strong proficiency in Python and related CV and NLP tools and libraries, and familiarity with deep learning frameworks such as PyTorch, TensorFlow, and Keras, as well as the SciPy stack and scikit-learn

Nice to Have: 

  • Experience with NLP techniques such as named entity recognition, relationship extraction, document classification, document summarization, topic modeling, machine translation, sentiment analysis, dialogue systems

  • Experience in document image processing, e.g., computer vision methods, image classification, object detection, segmentation, layout analysis, redaction, handwriting recognition

  • Familiarity with NLP technologies such as NLTK, spaCy, pandas, and NumPy, along with an understanding of pre-trained language models and architectures like BERT, GPT, T5, XLNet, PL-Marker, TPLinker, OneRel, Hugging Face and OpenAI models, etc.


Internship Program Details:

  • Students must be enrolled in an accredited U.S. educational institution with a graduation date after August 2023. 

  • Summer 2023 program dates are May 15 – September 8. Interns may choose from three onboarding dates: May 15, June 5, and June 20. Students may offboard on any Friday beginning August 11, and all internships must be wrapped up by September 8.

  • FULLY PAID temporary housing and travel to and from the internship

  • All summer internships will be in Lehi, Utah. You will work a combined hybrid and office-based schedule that allows you to choose which days you come into the office and which days you work from temporary housing/home (Utah students).

  • Interns have the opportunity to network and partner with other interns and industry-leading professionals

  • You will participate in engaging events including executive speaker sessions, professional development, and our annual Intern Days to showcase your project and work. 

  • Full-time schedule (40 hours/week) required; Monday-Friday

  • Company-issued laptop and equipment provided for the duration of the internship program

  • Our interns enjoy mentorship and challenging work while receiving a great compensation package, temporary housing, and a fun, captivating experience. We have it all. Oh, and did we mention the possibility of full-time employment once you graduate?

Additional Information:

Ancestry is an Equal Opportunity Employer that makes employment decisions without regard to race, color, religious creed, national origin, ancestry, sex, pregnancy, sexual orientation, gender, gender identity, gender expression, age, mental or physical disability, medical condition, military or veteran status, citizenship, marital status, genetic information, or any other characteristic protected by applicable law. In addition, Ancestry will provide reasonable accommodations for qualified individuals with disabilities.

All job offers are contingent on a background check screen that complies with applicable law.  For San Francisco office candidates, pursuant to the San Francisco Fair Chance Ordinance, Ancestry will consider for employment qualified applicants with arrest and conviction records.  

Ancestry is not accepting unsolicited assistance from search firms for this employment opportunity. All resumes submitted by search firms to any employee at Ancestry via email, the Internet, or in any other form or method without a valid written search agreement in place for this position will be deemed the sole property of Ancestry. No fee will be paid in the event the candidate is hired by Ancestry as a result of the referral or through other means.

Apply Now!
