Session
S47: Statistical software engineering in the pharmaceutical industry
Session Abstract
60 minutes of presentations followed by 40 minutes of discussion
Presentations
8:30am - 8:50am
First year of the Software Engineering working group - working together across organizations
1Hoffmann-La Roche Ltd, Switzerland; 2Gilead Sciences Inc., U.S.; 3Software Engineering Working Group
The Software Engineering (SWE) Working Group (WG) was formed in August 2022 in the American Statistical Association (ASA) Biopharmaceutical Section (BIOP). The SWE WG facilitates cross-organizational collaboration through regular meetings, and currently includes more than 30 members from over 20 organizations. The primary goal of the SWE WG is to engineer R packages that implement important statistical methods and so fill gaps in the open-source statistical software landscape. The first R package, “mmrm”, is setting a new standard for fitting mixed models for repeated measures (MMRM) in R. The secondary goal is to develop and disseminate best practices for engineering high-quality open-source statistical software. The video series “Statistical Software Engineering 101” introduces specific best practices in an accessible format. Furthermore, the workshop “Good Software Engineering Practice for R Packages” has been successfully taught in person at a BBS seminar, and the materials are publicly available to train statisticians on best practices. Communication is key: the SWE WG was introduced in a BIOP report and maintains a website, including a blog, at https://rconsortium.github.io/asa-biop-swe-wg. The SWE WG plans to develop additional R packages covering critical and innovative methodology topics in the health-technology assessment (HTA) space, covariate adjustment, and Bayesian inference for MMRMs. We describe the journey of the SWE WG so far, and in particular the ingredients for working together successfully, including mutual interest, getting to know each other, and creating mutual trust.
8:50am - 9:10am
Refactoring and extending an existing R package across companies - learnings from the crmPack team
1Bayer AG, Germany; 2Roche; 3Merck Healthcare KGaA, Germany
Working collaboratively across companies on an open-source project does not come naturally to many statisticians, statistical analysts and developers in the pharmaceutical industry. Although there has been an increasingly strong trend towards more knowledge sharing in recent years, the day-to-day work of many colleagues still consists of finding home-grown, customized solutions for problems and new statistical methods with proprietary software. Making crmPack an open-source collaboration project was an expedient next step when the need for a flexible approach to simulating Phase I dose escalation models came up in multiple companies in parallel. The compelling advantage of this package is its flexible framework, which allows easy extension and enhancement of existing methods. A cross-company team, drawn from pharmaceutical companies, academic institutes, and clinical research organizations, quickly learned to collaborate using agile principles. However, such a collaboration can be very diverse in educational background, expertise, and expectations. Initially, it might be challenging to establish a common understanding of how to contribute and a way forward while still guaranteeing a certain level of code quality, and to set up reliable communication channels. We would like to share some learnings and best practices we developed while working together on the extension of crmPack. Initiatives like crmPack offer the possibility of reaching industry-wide standards in the future, which will enhance our work, and of joining forces and using crowd intelligence to improve code quality and save effort at the same time, while following the pharmaceutical industry's GxP.
9:10am - 9:30am
To package or not to package - a pragmatic approach to deciding whether an R package is the right solution for your problem and alternatives to consider
Boehringer-Ingelheim, Germany
In recent years, statisticians and analysts have increasingly adopted open-source software such as the R language. The success of R in statistics and beyond is in large part due to its extensive ecosystem of open-source extension packages. Writing R packages is thus an increasingly important skill for statisticians who want to bring novel methodology into practice, make their workflows more reproducible, or share code effectively with collaborators. This talk addresses some potential downsides of wrapping R code in a package, such as the burden of continued maintenance, and highlights alternative formats for sharing functionality with collaborators. Many software engineering best practices for R packages can be used outside of a full R package, enabling a more adequate compromise between quality and simplicity depending on the application. In this talk, the roles of scripts, functions, literate programming (R Markdown and Quarto), R packages, application programming interfaces (APIs), and Shiny apps in the R ecosystem are reviewed, and guidance is given on how to select the right tool for a particular objective. Hands-on recommendations are discussed on where and how software engineering best practices like version control, testing, or documentation can be implemented outside of an R package context. Knowledge of the broad range of options the R ecosystem offers for making statistical analysis code available can both lower the entry hurdle for newcomers to the R language and further increase the impact of more advanced R users.