A Data Science Central Community

We are interested here in factoring numbers that are a product of two very large primes. Such numbers are used by encryption algorithms such as RSA, and the prime factors represent the keys (public and private) of the encryption code. Here you will also learn how data science techniques are applied to big data, including visualization, to derive insights. This article is good reading for the data scientist in training, who might not necessarily have easy access to interesting data: here the dataset is the set of all real numbers -- not just the integers -- and it is readily available to anyone. Much of the analysis performed here is statistical in nature, and thus, of particular interest to data scientists.

Factoring numbers that are a product of two large primes allows you to test the strength (or weakness) of these encryption keys. It is believed that if the prime numbers in question are a few hundred binary digits long, factoring is nearly impossible: it would require years of computing power on distributed systems, to factor just one of these numbers.

While the vast majority of big numbers have some small factors and are thus easier to break, the integers that we are dealing with here are difficult to handle, by design. Factoring a product of two large primes is considered to be an intractable computer science problem. Here, we use techniques of big data, statistics, and machine learning - in short a data science approach - to hopefully discover new efficient factoring techniques for these massive numbers. Our innovative approach is quite different from that used in traditional algorithms.

We have investigated this problem in the past already (click here for references) but here we propose a different, more general framework, offering multiple opportunities to jump-start new research paths on this topic.

Click here to read the full article. This long, rather technical article, featuring a brand new framework for factoring massive numbers, has the following content:

**1. General Framework**

- Fixed Point Algorithm
- Fundamental Result
- Simplified Fixed Point Algorithm

**2. Case Study**

- Implementation details
- Conclusion
- Deeper Analysis

**3. Data, Code, Visualizations**

**4. Improving the Algorithm**

- Strategies to improve efficiency
- Other articles by the same author

**5. Source Code**

© 2020 TechTarget, Inc. Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions

## You need to be a member of AnalyticBridge to add comments!

Join AnalyticBridge