ml_model package
Submodules
ml_model.features module
Most rules are described in ‘Secrets in Source Code: Reducing False Positives Using Machine Learning’.
- class credsweeper.ml_model.features.FileExtension(extensions)
Bases: credsweeper.ml_model.features.Feature
Categorical feature of file type.
- class credsweeper.ml_model.features.HartleyEntropy(base, norm=False)
Bases: credsweeper.ml_model.features.RenyiEntropy
Hartley entropy feature.
- class credsweeper.ml_model.features.HasHtmlTag
Bases: credsweeper.ml_model.features.Feature
Feature is true if line has HTML tags (HTML file).
- class credsweeper.ml_model.features.IsSecretNumeric
Bases: credsweeper.ml_model.features.Feature
Feature is true if candidate value is a numerical value.
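The numeric check can be illustrated with a minimal sketch. The helper name `is_numeric` and the use of `float()` parsing are assumptions made for illustration; the actual IsSecretNumeric implementation may use a different test.

```python
def is_numeric(value: str) -> bool:
    """Hypothetical helper: True if `value` parses as a number.

    Assumption: "numerical value" means anything Python's float() accepts.
    """
    try:
        float(value)
    except ValueError:
        return False
    return True
```

Under this assumption, a value such as "12345" or "3.14" counts as numeric, while "abc123" does not.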
- class credsweeper.ml_model.features.PossibleComment
Bases: credsweeper.ml_model.features.Feature
Feature is true if candidate line starts with #,*,/*? (Possible comment).
- class credsweeper.ml_model.features.RenyiEntropy(base, alpha, norm=False)
Bases: credsweeper.ml_model.features.Feature
Renyi entropy.
See the following link for details: https://digitalassets.lib.berkeley.edu/math/ucb/text/math_s4_v1_article-27.pdf
- Parameters
  - CHARS – Number base
  - alpha (float) – entropy parameter
  - norm – set True to normalize output probabilities
- CHARS = {'base36': 'abcdefghijklmnopqrstuvwxyz1234567890', 'base64': 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=', 'hex': '1234567890abcdefABCDEF'}
- estimate_entropy(p_x)
Calculate the Renyi entropy of the ‘p_x’ sequence.
The function is based on the definition of Renyi entropy for an arbitrary probability distribution. See the following link for details: https://digitalassets.lib.berkeley.edu/math/ucb/text/math_s4_v1_article-27.pdf
- Return type
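As an illustration of the definition referenced above, the following is a minimal sketch of Renyi entropy for a discrete distribution, H_alpha(p) = log2(sum(p_i ** alpha)) / (1 - alpha). The function name `renyi_entropy`, the base-2 logarithm, and the interpretation of `norm` as rescaling the probabilities to sum to 1 are assumptions; this is not the library's own code.

```python
import math
from typing import Sequence


def renyi_entropy(p_x: Sequence[float], alpha: float, norm: bool = False) -> float:
    """Hypothetical sketch: Renyi entropy (in bits) of the distribution p_x."""
    probs = [p for p in p_x if p > 0]  # zero-probability symbols contribute nothing
    if not probs:
        return 0.0
    if norm:
        # Assumed meaning of `norm`: rescale so the probabilities sum to 1.
        total = sum(probs)
        probs = [p / total for p in probs]
    if alpha == 0:
        # Hartley entropy: log of the number of symbols that occur.
        return math.log2(len(probs))
    if alpha == 1:
        # Shannon entropy: the limit of the Renyi formula as alpha -> 1.
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)
```

With alpha = 0 the formula reduces to Hartley entropy and with alpha = 1 (taken as a limit) to Shannon entropy, which matches HartleyEntropy and ShannonEntropy being subclasses of RenyiEntropy. For a uniform distribution all three values coincide.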
- class credsweeper.ml_model.features.RuleName(rule_names)
Bases: credsweeper.ml_model.features.Feature
Categorical feature that corresponds to rule name.
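A categorical feature such as RuleName or FileExtension is typically turned into a numeric vector before it reaches a model. The sketch below assumes one-hot encoding over the list passed to the constructor; the helper `one_hot` and the example rule names are hypothetical, and the library may encode categories differently.

```python
from typing import List, Sequence


def one_hot(value: str, categories: Sequence[str]) -> List[int]:
    """Hypothetical helper: 1 at the position of the matching category, 0 elsewhere.

    Assumption: a value outside `categories` maps to an all-zero vector.
    """
    return [1 if value == category else 0 for category in categories]


# Illustrative category list, not the library's actual rule names.
example_rules = ["Password", "Token", "API"]
encoded = one_hot("Token", example_rules)
```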
- class credsweeper.ml_model.features.ShannonEntropy(base, norm=False)
Bases: credsweeper.ml_model.features.RenyiEntropy
Shannon entropy feature.
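To make the Shannon entropy feature concrete, the sketch below estimates character probabilities of a candidate string over one of the documented alphabets and computes the entropy in bits. How the library derives probabilities from a string is an assumption here (counts of in-alphabet characters divided by the string length), and the helper names are hypothetical.

```python
import math
from collections import Counter
from typing import List

# Alphabet copied from the documented CHARS['hex'] entry.
HEX_CHARS = "1234567890abcdefABCDEF"


def char_probabilities(value: str, alphabet: str) -> List[float]:
    """Hypothetical helper: relative frequency of each alphabet character in `value`."""
    if not value:
        return []
    counts = Counter(ch for ch in value if ch in alphabet)
    return [count / len(value) for count in counts.values()]


def shannon_entropy(value: str, alphabet: str) -> float:
    """Shannon entropy in bits: -sum(p * log2(p)) over the observed characters."""
    return -sum(p * math.log2(p) for p in char_probabilities(value, alphabet))
```

A string that repeats a single character has zero entropy, while a string with two equally frequent characters has an entropy of one bit. High-entropy values are more likely to be randomly generated secrets.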
- class credsweeper.ml_model.features.WordInLine(words)
Bases: credsweeper.ml_model.features.Feature
Feature is true if line contains at least one word from predefined list.
- class credsweeper.ml_model.features.WordInPath(words)
Bases: credsweeper.ml_model.features.Feature
Feature is true if candidate path contains at least one word from predefined list.
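The WordInLine and WordInPath checks share the same idea: the feature is true when any word from a predefined list occurs in the text. A minimal sketch follows; the helper name `contains_any_word` and the case-insensitive substring match are assumptions, and the library may tokenize or match differently.

```python
from typing import Iterable


def contains_any_word(text: str, words: Iterable[str]) -> bool:
    """Hypothetical helper: True if at least one word occurs in `text`.

    Assumption: matching is a case-insensitive substring search.
    """
    lowered = text.lower()
    return any(word.lower() in lowered for word in words)
```

For example, under these assumptions a path such as "src/Test/config.py" matches the word list ["test", "example"], which may indicate that a detected credential is only a test fixture.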
ml_model.ml_validator module
- class credsweeper.ml_model.ml_validator.MlValidator
Bases: object
- classmethod extract_common_features(candidates)
Extract features that are guaranteed to be the same for all candidates on the same line with the same value.
- Return type
- classmethod extract_unique_features(candidates)
Extract features that can be different between candidates. Join them with the OR operator.
- Return type
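Joining per-candidate features with the OR operator can be sketched as an element-wise OR over boolean feature vectors. The function `or_join` and the list-of-lists representation are assumptions for illustration; the actual method works on candidate objects and may return a different container type.

```python
from typing import List, Sequence


def or_join(feature_rows: Sequence[Sequence[bool]]) -> List[bool]:
    """Hypothetical helper: element-wise OR across equally long feature vectors."""
    if not feature_rows:
        return []
    # zip(*rows) groups the i-th feature of every candidate together.
    return [any(column) for column in zip(*feature_rows)]


# Two candidates found on the same line, each with three boolean features.
merged = or_join([[True, False, False], [False, False, True]])
```

The merged vector is true in every position where at least one candidate has the feature set, so a single prediction can cover all candidates that share a line and value.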