US Army Cyber Command and affiliated units
1. Perform hands-on data analysis and modeling with large and disparate data sets in order to identify patterns such as malicious behavior and unauthorized users.
2. Find and communicate trends and stories within data to senior leadership in order to inform executive decision-making.
3. Develop, configure, and maintain algorithms that aggregate, correlate, and report on very large data sets derived from disparate sensing and logging capabilities.
4. Work with distributed storage and compute environments that support parallel processing.
5. Implement analytics that focus on cyber security discovery and detection methodologies.
6. Apply data mining, natural language processing (NLP), and machine learning (both supervised and unsupervised) to achieve organizational objectives.
7. Work side-by-side with mission leaders, software engineers, and operators in designing experiments and minimum viable products.
8. Discover data sources, get access to them, import them, clean them up, and make them "model-ready". You need to be willing and able to do your own ETL.
9. Create and refine features from the underlying data. You'll enjoy developing just enough subject matter expertise to have an intuition about what features might make your model perform better, and then you'll lather, rinse and repeat.
10. Run regular A/B tests, gather data, perform statistical analysis, draw conclusions about the impact of your optimizations, and communicate results to peers and leaders.
Qualifications:
a. 1+ years of experience or a relevant degree in data science.
b. BS, MS, or PhD in an appropriate technology field (Computer Science, Statistics, Applied Math, Operations Research, or a related field).
c. Expertise in modern advanced analytical tools and programming languages such as R or Python with scikit-learn, NumPy, SciPy, etc.
d. Fluency in SQL, Hive, Spark SQL, etc.
e. Expertise in data mining algorithms and statistical modeling techniques such as clustering, classification, regression, decision trees, neural nets, support vector machines, genetic algorithms, anomaly detection, recommender systems, sequential pattern discovery, and text mining.
f. Knowledge of cyber security operations, cyber security data types, cyber security threats, and network protocols (SOC tools and methodologies).
g. Strong mathematical background (linear algebra, calculus, probability, and statistics).
h. Solid communication skills: demonstrated ability to explain complex technical issues to both technical and non-technical audiences, and experience using disparate data sources to tell a cohesive story to varied audiences.