Abstract by Chandler Fischbeck
Creating and Using a Denumerated Data Set
One consequence of the data analytics craze is that many organizations have become data hungry, buying or eliciting data from many sources. As a result, some data sets contain personal data that many individuals view as sensitive or private. This perceived digital invasion of privacy has prompted efforts to obfuscate individual data. The “Denumeration” algorithm is one such attempt: it obfuscates data so that all individual information is removed but much or most of the aggregate information is retained for public use. This paper reviews the Denumeration algorithm and illustrates its use in comparing health outcomes of two cohorts using a synthetic matching algorithm based on denumerated data. Results are based on simulated cohorts and are compared with results from complete, individual-specific data.