
Sklearn 'Seed' Not Working Properly In a Section of Code [closed]





I have written an ensemble using scikit-learn's VotingClassifier.

I have set a seed in the cross-validation section. However, it does not appear to 'hold': if I re-run the code block, I get different results. (I can only assume each run of the code block divides the dataset into folds with different constituents instead of 'freezing' the random state.)



Here is the code:



from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier

# Voting ensemble of classifiers
# Create submodels
num_folds = 10
seed = 7
kfold = KFold(n_splits=num_folds, random_state=seed)
estimators = []
model1 = LogisticRegression()
estimators.append(('LR', model1))
model2 = KNeighborsClassifier()
estimators.append(('KNN', model2))
model3 = GradientBoostingClassifier()
estimators.append(('GBM', model3))
# Create the ensemble
ensemble = VotingClassifier(estimators, voting='soft')
results = cross_val_score(ensemble, X_train, Y_train, cv=kfold)
print(results)


The printed values are the scores for the 10 cross-validation folds. If I run this code block several times I get the following results:



1:



[0.70588235 0.94117647 1. 0.82352941 0.94117647 0.88235294
0.8125 0.875 0.8125 0.9375 ]


2:



[0.76470588 0.94117647 1. 0.82352941 0.94117647 0.88235294
0.8125 0.875 0.8125 0.875 ]


3:



[0.76470588 0.94117647 1. 0.82352941 0.94117647 0.88235294
0.8125 0.875 0.8125 0.875 ]


4:



[0.76470588 0.94117647 1. 0.82352941 1. 0.88235294
0.8125 0.875 0.625 0.875 ]


So it appears my random_state=seed isn't holding.



What is incorrect?



Thanks in advance.










python scikit-learn ensemble

asked Mar 23 at 15:13 by Windstorm1981


closed as off-topic by jbowman, Sycorax, Robert Long, Michael Chernick, mdewey Mar 24 at 11:30

This question appears to be off-topic. The users who voted to close gave this specific reason:

• "This question appears to be off-topic because EITHER it is not about statistics, machine learning, data analysis, data mining, or data visualization, OR it focuses on programming, debugging, or performing routine operations within a statistical computing platform. If the latter, you could try the support links we maintain." – Sycorax, Robert Long, Michael Chernick, mdewey

If this question can be reworded to fit the rules in the help center, please edit the question.

1 Answer

The random seeds of the models (LogisticRegression, GradientBoostingClassifier) need to be fixed as well, so that their random behaviour becomes reproducible. Here is a working example that produces the same result over multiple runs:



import sklearn
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
import numpy as np

# Voting ensemble of classifiers
# Create submodels
num_folds = 10
seed = 7

# Data
np.random.seed(seed)
feature_1 = np.random.normal(0, 2, 10000)
feature_2 = np.random.normal(5, 6, 10000)
X_train = np.vstack([feature_1, feature_2]).T
Y_train = np.random.randint(0, 2, 10000).T

kfold = KFold(n_splits=num_folds, random_state=seed)
estimators = []
model1 = LogisticRegression(random_state=seed)
estimators.append(('LR', model1))
model2 = KNeighborsClassifier()
estimators.append(('KNN', model2))
model3 = GradientBoostingClassifier(random_state=seed)
estimators.append(('GBM', model3))
# Create the ensemble
ensemble = VotingClassifier(estimators, voting='soft')
results = cross_val_score(ensemble, X_train, Y_train, cv=kfold)
print('sklearn version', sklearn.__version__)
print(results)


Output:

sklearn version 0.19.1
[0.502 0.496 0.483 0.513 0.515 0.508 0.517 0.499 0.515 0.504]
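
As a quick sanity check (a sketch reusing the variables from the snippet above), scoring a second time with the same seeded splitter and estimators should reproduce the fold scores:

import numpy as np

# Re-score with the same seeded splitter and estimators; with every seed fixed,
# the fold scores match the first run (allclose guards against tiny
# floating-point noise).
results_again = cross_val_score(ensemble, X_train, Y_train, cv=kfold)
assert np.allclose(results, results_again)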





edited Mar 23 at 17:49, answered Mar 23 at 17:10 by Esmailian

• "Thanks for your quick reply. Not sure I follow completely. random_state=seed fixes my cross validation. I note your line np.random.seed(seed). Intuitively it suggests to me it is ensuring repeatable generation of toy data. I already have a data set. How does that apply to 'fixing seed of models'?" – Windstorm1981, Mar 23 at 17:34

• "@Windstorm1981 My bad. Updated." – Esmailian, Mar 23 at 17:44

• "ha! Clear now. So fixing the cv fixes the data splits. Fixing the models fixes how the models handle the (fixed) data splits?" – Windstorm1981, Mar 23 at 17:46

• "@Windstorm1981 Exactly!" – Esmailian, Mar 23 at 17:47
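
For reference, here is a minimal sketch of the fix applied to the question's original snippet, assuming X_train and Y_train are already defined as in the question: the CV splitter and the two estimators that use randomness get the seed, while KNeighborsClassifier is deterministic and needs none.

from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier

seed = 7
# As in the question; note that newer sklearn versions require shuffle=True
# when random_state is set on KFold.
kfold = KFold(n_splits=10, random_state=seed)
estimators = [
    ('LR', LogisticRegression(random_state=seed)),          # seed the solver
    ('KNN', KNeighborsClassifier()),                         # no random_state parameter
    ('GBM', GradientBoostingClassifier(random_state=seed)),  # seed the boosting stages
]
ensemble = VotingClassifier(estimators, voting='soft')
print(cross_val_score(ensemble, X_train, Y_train, cv=kfold))  # same scores on every run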
















