TEACHING @ PENN STATE
DS300.002 - PRIVACY IN DATA SCIENCES
(FALL 2021; FALL 2022)
This course provides an overview of the data privacy and security implications of data analytics projects. Students will learn design principles to enhance privacy and security. The course also covers the processes and tools used to secure data and protect privacy in distributed, large-scale data sets. Students will learn about privacy and security management approaches and how economic incentives drive the protection of data. From a design perspective, the course will address how privacy-preserving statistical databases are built and which cryptographic primitives can be applied to analytics projects to protect data. The material will address how those concepts can be applied in specific contexts such as data mining and recommender systems. The course further provides an overview of system security approaches for data protection and of how to provide effective access control.
IST 597.001 - FOUNDATIONS OF DATA PRIVACY
(SPRING 2020; SPRING 2021; SPRING 2022; SPRING 2023)
This course covers the mathematical and computational foundations of data privacy, as well as principles and methods for privacy-preserving pattern discovery and predictive modeling. We particularly emphasize methods for data containing personally identifiable information and/or information sensitive to individuals and organizations. Specific topics include the economics of privacy, anonymization methods, tokenization, privacy-preserving data mining and differential privacy. Implementation of privacy-preserving methods will be considered across various data types, including but not limited to graph and transactional data. The objective of the course is to provide students with a comprehensive understanding of the foundations of data privacy as well as the necessary skills for knowledge discovery.
IST 597.004 - REPRODUCIBILITY IN THE DATA-DRIVEN SCIENCES
This course reviews fundamental topics around the reproducibility crisis (the recent revelation that wide swaths of published science cannot be reproduced), with a specific focus on methodological concerns in computational and big data science. We take a multidisciplinary approach to this material, emphasizing historical and ethical context, as well as specific technical challenges and emerging methods to address these challenges. The course is run as a seminar, with discussion and critical analysis of assigned readings. The course also includes hands-on, data-driven projects, replication of existing findings and preregistration of a study of the students' choosing.
IST 230 - LANGUAGE, LOGIC AND DISCRETE MATHEMATICS
(FALL 2018 - SUMMER 2023; Course chair)
IST 230 is one of the five introductory core courses for the baccalaureate degree program in Information Sciences and Technology. The purpose of IST 230 is to provide students with an understanding of an array of mathematical concepts and methods that form the foundation of modern information science, in a form that will be relevant and useful for IST students. The course draws material from several mathematical disciplines: formal language theory, mathematical logic and discrete mathematics.