PSYC 859 Syllabus

Published

March 12, 2026

Course Description

This graduate course is intended to provide an applied introduction to data management and data visualization in the social sciences. In order to take full advantage of modern statistical methods (e.g., structural equation models), competency in data management, semi-automated processing, and data wrangling is a prerequisite. Likewise, prior to employing inferential statistics, exploratory visualization and analysis are essential to facilitate data cleaning and to form an initial understanding of patterns in the data. This course will cover both the principles and practice of data management, visualization, and exploratory analysis for summarizing quantitative data. In addition, students will learn data science skills to manage and visualize “big data,” where the size or complexity of the dataset defies traditional techniques.

Applications of data management, visualization, and analysis will use the R statistical programming language. R is quickly becoming the lingua franca in data science across disciplines and offers unparalleled tools for data analysis and visualization.

Learning Objectives

  • Design and implement reproducible data workflows for managing, cleaning, and documenting complex datasets using modern R-based tools.
  • Use exploratory data analysis and visualization as tools for scientific reasoning, including identifying patterns, anomalies, uncertainty, and data limitations.
  • Apply evidence-based principles of graphical perception and design to create clear, accurate, and interpretable quantitative graphics.
  • Critically evaluate and redesign data visualizations in scientific research and public communication, justifying design choices and trade-offs.
  • Communicate substantive insights through publication-quality and presentation-ready visualizations, integrating narrative, annotation, and appropriate use of modern visualization technologies.

Prerequisites

PSYC 830 (Statistical Methods in Psychology I) or graduate course equivalent.

Students are expected to have a basic understanding of R or another high-level programming language (e.g., Python). Minimally, students should understand basic principles of computer programming, including:

  1. conditional logic (if/else) and logical operators (e.g., equality)
  2. basic data types (in R, vectors, lists, data.frames, matrices, and arrays)
  3. flow control (for/while loops, next, break)
  4. import and export of data from text files
  5. subsetting data using basic R syntax such as x[1:10, c(1,5)].
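
As a rough self-check of the five skills listed above, the following base-R sketch touches each one in turn. It is only an illustration: the data frame `scores`, the helper `label_score()`, and all values are hypothetical and are not drawn from any course dataset. If most of it reads comfortably, you likely meet the programming prerequisite.

```r
# Hypothetical toy data, invented purely for illustration
scores <- data.frame(
  id    = 1:6,
  group = c("a", "b", "a", "b", "a", "b"),
  score = c(12, 7, 15, NA, 9, 11)
)

# 1. Conditional logic and logical operators
label_score <- function(x) {
  if (is.na(x)) {
    "missing"
  } else if (x >= 10) {
    "high"
  } else {
    "low"
  }
}

# 2. Basic data types
v <- c(1.5, 2, 3)                  # numeric vector
l <- list(name = "x", values = v)  # list holding mixed types
m <- matrix(1:6, nrow = 2)         # 2 x 3 matrix

# 3. Flow control: a for loop, plus next and break
labels <- character(nrow(scores))
for (i in seq_len(nrow(scores))) {
  labels[i] <- label_score(scores$score[i])
}

n_seen <- 0
for (s in scores$score) {
  if (is.na(s)) next   # skip missing values
  if (s > 14) break    # stop at the first score above 14
  n_seen <- n_seen + 1
}

# 4. Export to and import from a text file (a temporary CSV)
path <- tempfile(fileext = ".csv")
write.csv(scores, path, row.names = FALSE)
scores_in <- read.csv(path)

# 5. Subsetting: rows 1 to 3, columns 1 and 3
sub <- scores[1:3, c(1, 3)]
```

The temporary file keeps the example self-contained; in practice you would read from and write to files in your own project directory.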

Required Textbooks

  • Tufte, E. R. (2001). The visual display of quantitative information (2nd ed.). Cheshire, CT: Graphics Press.
  • Wickham, H. (2025). ggplot2: Elegant graphics for data analysis (3rd ed.). New York: Springer. Please use the online (not-yet-published) version here: https://ggplot2-book.org.
  • Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (2nd ed.). O’Reilly Media. Available online: https://r4ds.hadley.nz.

Required Software

Class Structure

With a few exceptions, class will be structured into three blocks as follows:

  • 9:00am-10:10am Figure critique, lecture, and discussion of readings
  • 10:10am-10:20am Break
  • 10:20am-11:30am R demonstration and practical exercise

During the R demonstration, we will work on a data-related project together, so please bring a laptop with the above software and packages loaded. Let me know if this will be a difficulty for you so that we can arrange alternative plans.

Evaluation

Although the course will review the principles of effective data visualization (e.g., graphical perception), it is primarily intended to develop your applied skills in managing and visualizing data. Consequently, there will be no formal exams or quizzes. Instead, your grade will be based primarily on figure critiques, take-home exercises, participation, and three projects. Students are encouraged to bring a dataset that is relevant to their research for use in each project. If possible, choose a challenging dataset (one that defies simple management in a spreadsheet format), as it will provide better opportunities to learn advanced data management and visualization skills.

All projects are to be completed individually. Although you are encouraged to discuss data management and visualization challenges with your classmates, you’ll get the most benefit from the course by developing projects yourself.

  • 15% Participation, as defined by attending class, contributing to reading discussions, engagement with lab exercises, and otherwise contributing to scholarly discourse.

  • 10% Figure critiques and data exercises. In the first part of the semester, students will complete take-home data exercises to become familiar with managing, tidying, and wrangling data in R. In the latter half of the semester, students will be expected to submit a critique of at least one figure or table. The critique can be brief, perhaps in bullet form, but should highlight key strengths and limitations of the display, as well as suggestions for alternative visualizations. One (or perhaps a few) figures will be discussed at the beginning of class before the discussion of readings.

    Assignments will be due by 8am on Wednesdays to provide time to review them and incorporate them into class discussion.

  • 5% Data quality assurance and processing proposal. Due: 1/29 (Week 4)

  • 15% Data quality assurance project (code and output). Due: 2/12 (Week 6)

  • 15% Conceptual figure/infographic project. Due: 3/5 (Week 9)

  • 5% Final project proposal. Due: 3/26 (Week 11)

  • 10% Final presentation of data visualization project. Due: 4/23 (Week 14)

  • 25% Data visualization final product. Due: 4/27

Schedule

Week 1 (1/8): Introduction to data management and tidy data

Conceptual readings

Practical readings

Week 2 (1/15): Data aggregation, manipulation, joins

Practical readings

Week 3 (1/22): Data processing and quality assurance, custom functions, basics of automation

Conceptual readings

Practical readings

Week 4 (1/29): Advanced data manipulation and management, tracking work in R markdown

Conceptual readings

Practical readings

Week 5 (2/5): Principles of data visualization and graphical grammar

Conceptual readings

Practical readings

Week 6 (2/12): Visual and graphical perception

Conceptual readings

Week 7 (2/19): Graphic design, layout, style, use of color

Conceptual readings

Practical readings

Optional readings

Week 8 (2/26): A tour of quantitative visualization

Conceptual readings

  • Tufte, The visual display of quantitative information, Chs. 6-8.
  • Heer, J., Bostock, M., & Ogievetsky, V. (2010). A Tour through the Visualization Zoo. Communications of the ACM, 53(6), 59-67.

Practical readings

Week 9 (3/5): Visualizing continuous data (in ggplot2)

Conceptual readings

  • Tufte, The visual display of quantitative information, Ch. 9.
  • Cleveland, W. S. (1993). Visualizing Data, Chs. 1-2.

Practical readings

Week 10 (3/12): Visualizing count and categorical data (in ggplot2)

Practical readings

3/19: No class (Spring break)

Week 11 (3/26): Maximizing clarity: preparing graphics for presentation and publication

Conceptual readings

Practical readings

Optional readings

4/2: No class (Well-being day)

Week 12 (4/9): Visualizing and understanding fit (and misfit) of statistical models

Week 13 (4/16): Exploratory statistics for understanding data: clustering, multidimensional scaling, dimension reduction

Week 14 (4/23): Final presentations of data projects

Class Attendance

You are advised to attend all lectures because some material presented in lecture will not be in the readings. Additionally, the lectures will give you a sense of what to focus on in the readings and how to integrate information across topics.

University Policy: As stated in the University’s Class Attendance Policy, no right or privilege exists that permits a student to be absent from any class meetings, except for these University Approved Absences:

Use of AI Tools (e.g., Large Language Models)

Recent advances in large language models (LLMs) and AI-assisted coding tools (e.g., ChatGPT, GitHub Copilot) have made them highly effective aids for data wrangling, visualization, and programming in R. In professional research and applied data science settings, such tools are increasingly used to accelerate development, diagnose bugs, and explore alternative implementations.

At the same time, a central goal of this course is for students to develop their competency in data management, exploratory analysis, and visualization. Becoming proficient in these areas will allow students to implement these skills independently and to critically evaluate code written by collaborators or generated by AI tools.

Permitted uses

Students may use AI tools for the following purposes:

  • Debugging code, including identifying syntax errors, logical errors, or unexpected behavior.
  • Requesting explanations of how a piece of code works, why it produces a particular result, or why it fails.
  • Requesting suggestions for code improvement, refactoring, or alternative approaches to a task.

These uses are consistent with how AI tools are used responsibly in real research workflows.

Expectations and responsibilities

When using AI tools, students are expected to:

  • Actively evaluate and understand any code they submit, regardless of its source.
  • Ensure they can explain what the code does and why it works, including key functions, assumptions, and consequences.
  • Make independent decisions about whether to adopt AI-suggested code, rather than copying it uncritically.
  • Remain responsible for correctness, clarity, and reproducibility of all submitted work.

Submitting code that the student does not understand is inconsistent with the learning objectives of the course.

Prohibited uses

The following are not permitted:

  • Submitting AI-generated code or analyses without understanding or review.
  • Using AI tools as a substitute for engaging with core course concepts (e.g., visualization principles, data cleaning logic).
  • Representing AI-generated work as understanding or reasoning that the student cannot demonstrate if asked.

Transparency

For major assignments and the final project, students may be asked to include a brief AI use statement (1-3 sentences) describing whether and how AI tools were used (e.g., “used for debugging,” “used to explore alternative ggplot layouts”). This is not punitive; its purpose is to promote transparency and reflective practice.

Laptops and mobile devices; video/audiotaping

You are encouraged to bring a laptop to class for course-related use during the lecture and practical part of each meeting. Please, however, ensure that laptops and mobile devices are silent during class. In addition, please refrain from texting, checking social media, or otherwise dividing your attention with personal matters. If you would like to make an audio or video recording of any of the lectures, please obtain the instructor’s permission first.

Equal Opportunity and Compliance - Accommodations

The Equal Opportunity and Compliance (EOC) Accommodations Team (Accommodations - UNC Equal Opportunity and Compliance) receives requests for accommodations for disability, pregnancy and related conditions, and sincerely held religious beliefs and practices through the University’s Policy on Accommodations. The EOC Accommodations Team determines eligibility and reasonable accommodations consistent with state and federal laws.

Counseling and Psychological Services (CAPS)

UNC-Chapel Hill is strongly committed to addressing the mental health needs of a diverse student body. The Heels Care Network website is a place to access the many mental health resources at Carolina. CAPS is the primary mental health provider for students, offering timely access to consultation and connection to clinically appropriate services. Go to the CAPS website or visit their facilities on the third floor of the Campus Health building for an initial evaluation to learn more. Students can also call CAPS 24/7 at 919-966-3658 for immediate assistance.