- Hotfix adding schema to DatabaseConnector::getTableNames when creating results tables
- Add support for R4.4
- Fix notes around documentation (vignette engine and brackets in itemize)
- Use webp image format where possible (not in pdfs) for smaller size
- Make sure random table names are unique in tests
- Remove remote info for Eunomia since it’s on CRAN
- Clean up dependencies: tibble removed; IHT and ParallelLogger now installed from CRAN
- Use cohortIds for cohortCovariates to comply with FeatureExtraction
- Add cdmDatabaseName from DatabaseDetails to model output
- Fix bug when attributes weren’t preserved on trainData$covariateData after split
- Fix warnings in tests and speed them up
- Fix bug in assignment operator in configurePython
- Delay evaluation of plpData when using do.call like in learningCurves and runMultiplePlp
- Speed up population generation when subjectIds are distinct
- Fix bug when population was still generated when provided to runPlp
- Fix bug with OhdsiShinyModules version check (issue 415)
- Fix sklearnToJson to be compatible with scikit-learn>=1.3
- Fix GitHub Actions so they are not hardcoded to use Python 3.7
- added spline feature engineering
- added age/sex stratified imputation feature engineering
- changed result table execution date types to varchar
- updated covariateSummary to use feature engineering
- fixed bug introduced with new reticulate update in model saving to json tests
- fixed bug with database insert if result is incomplete
- updated/fixed documentation (Egill)
- added model path to models (Henrik)
- updated hyper-parameter saving to data.frame and made consistent
- fixed bug with multiple covariate settings in diagnose plp
- added min cell count when exporting database results to csv files
- LightGBM added (thanks Jin Choi and Chungsoo Kim)
- fixed minor bugs when uploading results to database
- added ensure_installed("ResultModelManager") to getDataMigrator()
- shiny app is now using ShinyAppBuilder with a config saved in the /inst folder
- fixed bugs introduced when sklearn inputs changed
- added saving of sklearn models as json files
- made changes around the DatabaseConnector getTableNames function to make it work with the updated DatabaseConnector
- removed check RAM stop (now it just warns)
- Updated test to skip the FE setting test if the model does not fit (this was causing occasional test failures)
- replaced .data$ with "" for all dplyr::select calls to remove warnings
- Fix bug with python type being required to be int
- Allow priorType to be passed down to getCV function in case prior is not ‘laplace’
- Fix bug where the seed specified in the Cyclops model wasn’t passed to Cyclops
- fixed issue with shiny viewer converting connection details to large json
- added check for cdmDatabaseId into createDatabaseDetails
- added test that the cdmDatabaseId check in createDatabaseDetails errors when NULL
- removed session$onSessionEnded(shiny::stopApp) from shiny server
- forcing cdmDatabaseId to be a string if integer is input
- replaced utils::read.csv with readr::read_csv when inserting results from csv
- replaced gsub with sub when inserting csvs to database
- saved result specification csv in windows to fix odd formatting issue
- fixed sample data bugs
- updated to use v1.0.0 of OhdsiShinyModules
- updated plp database result tables to use the same structure for cohort and database as other HADES packages
- added function to insert csv results into plp database result tables
- added input for databaseId (database and version) when extracting data to be consistent with other HADES packages. This is saved in plp objects.
- fixed issue with ‘preprocess’ vs ‘preprocessing’ inconsistently used across models
- added metaData tracking for feature engineering or preprocessing when predicting
- fixed issue with FE using trainData$covariateData metaData rather than trainData
- fixed bug when using sameData for FE
- pulled in multiple bug fixes and test improvements from Egill
- pulled in fix for learning curves from Henrik
- Pulled in fix for feature engineering from Solomon
- Cleaned check messages about comparing class(x) with a string by changing to inherits()
- removed json saving for sklearn models since sklearn-json is no longer working for the latest sklearn
- renamed the input corresponding to the string that gets appended to the results table names to tablePrefix
- fixed issues with system.file() from SqlRender code breaking the tests
- added an input fileAppend to the function that exports the database tables to csv files
- moved the plp model (including preprocessing details) outside of the result database (into a specified folder) due to the size of the objects (too large to insert into the database).
- added saving of plp models into the result database
- added default cohortDefinitions in runMultiplePlp
- added modelType to all models for database upload
- moved FeatureExtraction to depends
- fixed using inherits()
- moved most of the shiny app code into OhdsiShinyModules
- removed shiny dependencies and added OhdsiShinyModules to suggests
- fixed bug with linux sklearn saving
- replaced cohortId with targetId for consistency throughout code
- replaced targetId in model design with cohortId for consistency throughout code
- replaced plpDataSettings with restrictPlpDataSettings to improve naming consistency
- added ability to use initial population in runPlp by adding the population to plpData$population
- added splitSettings into modelDesign
- replaced saving json settings with ParallelLogger function
- updated database result schema (removed researcher_id from tables - if desired a new table with the setting_ids and researcher_id could be added, removed study tables and revised results table to performances table with a reference to model_design_id and development_database_id to enable validation results without a model to be inserted)
- added diagnostic code based on PROBAST
- added diagnostic shiny module
- added code to create sqlite database and populate in uploadToDatabase
- add code to convert runPlp+val to sqlite database when viewing shiny
- added code to extract database results into csv files: extractDatabaseToCsv()
- pulled in GBM update (default hyper-parameters and variable importance fix) work done by Egill (egillax)
- updated installation documents
- added tryCatch around plots to prevent code stopping
- updated result schema (added model_design table with settings and added attrition table)
- updated shiny app for new database result schema
- removed C++ code for AUC and Rcpp dependency, now using pROC instead as it is faster
- made covariate summary optional when externally validating
- updated json structure for specifying study design (made it friendlier to read)
- includes smooth calibration plot fix - work done by Alex (rekkasa)
- fixed bug with multiple sample methods or feature engineering settings causing invalid error
- plpModel now saved as json files when possible
- Updated runPlp to make more modular
- now possible to customise data splitting, feature engineering, sampling (over/under) and learning algorithm
- added function for extracting cohort covariates
- updated evaluation to evaluate per strata (evaluation column)
- updated plpModel structure
- updated runPlp structure
- updated shiny and package to use tidyr and not reshape2
- sklearn learning algorithms share the same fit function
- r learning algorithms share the same fit function
- interface to cyclops code revised
- ensemble learning removed (will be in separate package)
- deep learning removed (will be in DeepPatientLevelPrediction package)
- revised toSparseM() to do conversion in one go but check RAM availability beforehand
- removed temporal plpData conversion in toSparseM (this will be done in DeepPatientLevelPrediction)
- shiny can now read csv results
- objects loaded via loadPlpFromCsv() can be saved using savePlpResult()
- added database result storage
- added interface to database results in shiny
- merged in shinyRepo that changed the shiny app to make it modular and added new features
- removed deep learning as this is being added into new OHDSI package DeepPatientLevelPrediction
- save xgboost model as json file for transparency
- set connectionDetails to NULL in getPlpData
- updated andromeda functions - restrict to pop and tidy covs for speed
- quick fix for GBM survival predicting negative values
- fixed occasional demoSum error for survival models
- updated index creation to use Andromeda function
- fixed bug when normalize data is false
- fixed bugs when single feature (gbm + python)
- updated GBM
- updated calibration slope
- fixed missing age/gender in prediction
- fixed shiny intercept bug
- fixed diagnostic
- fixed missing covariateSettings in load csv plp
- Removed plpData from evaluation
- Added recalibration into externalVal
- Updated shiny app for recalibration
- Added population creation setting to use cohortEndDate as timeAtRisk end
- fixed tests
- Reduced imports by adding code to install some dependencies when used
- fixed csv result saving bug when no model param
- fixed r check vignette issues
- added conda install to test
- finalised permutation feature importance
- fixed deepNN index issue (reported on github - thanks dapritchard)
- add compression to python pickles
- removed requirement to have outcomeCount for prediction with python models
- cleaned all checks
- fixed bug in python toSparseMatrix
- fixed warning in studyPop
- fixed bug (identified by Chungsoo) in covariateSummary
- fixed bug with thresholdSummary
- edited threshold summary function to make it cleaner
- added option to combine multiple models into an ensemble
- cleaned up the notes and tests
- updated simulated data covariateId in tests to use integer64
- fixed description imports (and sorted them)
- fixed Cox model calibration plots
- fixed int64 conversion bug
- added baseline risk to Cox model
- updated shiny: added attrition and hyper-parameter grid search into settings
- updated shiny app: added 95% CI to AUC in summary; size is now the complete data size, and a new column valPercent gives the percentage of the data used for validation
- updated GBMsurvival to use survival metrics and c-stat
- added updates and fixes into master from development branch
- fixed bug with pdw data extraction due to multiple person_id columns
- fixed bug in shiny app converting covariate values due to tibble
- added calibration updates: cal-in-large, weak cal
- updated smooth cal plot (sample for speed in big data)
- defaulted to 100 values in calibrationSummary + updated cal plot
- fixed backwards compat with normalization
- fixed python joblib dependency
- fixed bug in preprocessing
- added cross validation aucs to LR, GBM, RF and MLP
- added more settings into MLP
- added threads option in LR
- fixed minor bug with shiny dependency
- fixed some tests
- added standardizedMeanDiff to covariateSummary
- updated createStudyPopulation to make it cleaner to read and count outcome per TAR
- Andromeda replaced ff data objects
- added age/gender into cohort
- fixed python warnings
- updated shiny plp viewer
- Fixed bug when running multiple analyses using a data extraction sample with multiple covariate settings
- improved shiny PLP viewer
- added diagnostic shiny viewer
- updated external validate code to enable custom covariates using ATLAS cohorts
- fixed issues with startAnchor and endAnchor
- Deprecating addExposureDaysToStart and addExposureDaysToEnd arguments in createStudyPopulation, adding new arguments called startAnchor and endAnchor. The hope is this is less confusing.
- fixed transfer learning code (can now transfer or fine-tune model)
- made view plp shiny apps work when some results are missing
- set up testing
- fixed build warnings
- added tests to get >70% coverage (keras tests too slow for travis)
- Fixed minor bugs
- Fixed deep learning code and removed pythonInR dependency
- combined shiny into one file with one interface
- added recalibration using 25% sample in existing models
- added option to provide score to probabilities for existing models
- fixed warnings with some plots

Small bug fixes:
- added analysisId into model saving/loading
- made external validation saving recursive
- added removal of patients with negative TAR when creating population
- added option to apply model without preprocessing settings (make them NULL)
- updated create study population to remove patients with negative time-at-risk

Changes:
- merged in bug fix from Martijn
- fixed AUC bug causing crash with big data
- updated SQL code to be compatible with v6.0 OMOP CDM
- added save option to external validate PLP

Changes:
- Updated splitting functions to include a split by subject and renamed personSplitter to randomSplitter
- Cast indices to integer in python functions to fix bug with non-integer sparse matrix indices

Changes:
- Added GLM status to log (will now inform about any fitting issue in log)
- Added GBM survival model (still under development)
- Added RF quantile regression (still under development)
- Updated viewMultiplePlp() to match PLP skeleton package app
- Updated single plp vignette with additional example
- Merged in deep learning updates from Chan

Changes:
- Updated website

Changes:
- Added more tests
- test files now match R files

Changes:
- Fixed ensemble stacker

Changes:
- Using reticulate for python interface
- Speed improvements
- Bug fixes