<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>software engineering Archives - Ger Inberg</title>
	<atom:link href="https://gerinberg.com/category/software-engineering/feed/" rel="self" type="application/rss+xml" />
	<link>https://gerinberg.com/category/software-engineering/</link>
	<description>data science developer</description>
	<lastBuildDate>Tue, 03 Sep 2024 12:41:39 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.1</generator>

<image>
	<url>https://gerinberg.com/wp-content/uploads/2017/05/favicon-150x150.jpg</url>
	<title>software engineering Archives - Ger Inberg</title>
	<link>https://gerinberg.com/category/software-engineering/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>FeatureExtraction on CRAN</title>
		<link>https://gerinberg.com/2024/09/03/featureextraction-on-cran/</link>
		
		<dc:creator><![CDATA[Ger]]></dc:creator>
		<pubDate>Tue, 03 Sep 2024 12:40:01 +0000</pubDate>
				<category><![CDATA[data science]]></category>
		<category><![CDATA[software engineering]]></category>
		<guid isPermaLink="false">https://gerinberg.com/?p=1861</guid>

					<description><![CDATA[<p>Feature engineering is a crucial step in the data science process, often making the difference between a good model and a great one. It involves transforming raw data into meaningful features that can improve the performance of predictive models. For those working in R, the [&#8230;]</p>
<p>The post <a href="https://gerinberg.com/2024/09/03/featureextraction-on-cran/">FeatureExtraction on CRAN</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Feature engineering is a crucial step in the data science process, often making the difference between a good model and a great one. It involves transforming raw data into meaningful features that can improve the performance of predictive models. For those working in R, the <strong>FeatureExtraction</strong> package on <a href="https://cran.r-project.org/web/packages/FeatureExtraction/index.html" target="_blank" rel="noopener">CRAN</a> offers a powerful and flexible toolset for automating and streamlining this process.</p>
<p>Originally developed as part of the OHDSI (Observational Health Data Sciences and Informatics) ecosystem, <strong>FeatureExtraction</strong> is particularly well-suited for working with large-scale observational data. In this post, we&#8217;ll explore the package in detail, focusing on its core features, practical applications, and a step-by-step example to help you get started.</p>
<h3>Key Features and Capabilities</h3>
<h4>1. <strong>Automated Feature Generation</strong></h4>
<p><strong>FeatureExtraction</strong> excels at automatically generating a wide range of features from raw data. These features include basic demographic variables like age and gender, as well as more complex attributes derived from longitudinal data, such as the frequency of medical visits or the presence of certain conditions over time.</p>
<h4>2. <strong>Temporal Features</strong></h4>
<p>Temporal data, such as patient histories or time-dependent events, are common in many fields, especially healthcare. <strong>FeatureExtraction</strong> handles temporal data adeptly, allowing users to define time windows relative to key events (e.g., diagnosis dates). This feature is crucial for creating time-sensitive covariates that capture trends and patterns in data over specified periods.</p>
<h4>3. <strong>Custom Feature Extraction</strong></h4>
<p>While the package offers extensive automated capabilities, it also allows for custom feature extraction. Users can define custom covariates and specify how these should be generated from the underlying data, incorporating domain-specific knowledge into the feature engineering process.</p>
<h4>4. <strong>Scalability</strong></h4>
<p>Feature engineering can become computationally intensive, particularly with large datasets. <strong>FeatureExtraction</strong> is designed for scalability, leveraging parallel processing and optimized algorithms to ensure that feature extraction remains efficient even with big data.</p>
<h4>5. <strong>Integration with OHDSI Tools</strong></h4>
<p>As part of the OHDSI ecosystem, <strong>FeatureExtraction</strong> integrates seamlessly with other tools like <strong>PatientLevelPrediction</strong> and <strong>CohortMethod</strong>, enabling a smooth workflow from data extraction to model building and analysis.</p>
<h3>Getting Started</h3>
<p>Installing the <strong>FeatureExtraction</strong> package is straightforward. You can install it directly from CRAN using the following command:</p>
<p><code>install.packages("FeatureExtraction")</code></p>
<p>Load the package in your R session:</p>
<p><code>library(FeatureExtraction)</code></p>
<div>
<h3>Practical Example: Creating Covariates Based on Other Cohorts</h3>
<p>To illustrate how <strong>FeatureExtraction</strong> can be applied, let&#8217;s walk through an example where we create covariates based on the presence of patients in other cohorts. This is particularly useful in studies where the relationship between different conditions or treatments over time is of interest.</p>
<h4>Step 1: Setting Up the Database Connection</h4>
<p>First, we need to define the connection to our CDM-compliant database:</p>
<p><code>connectionDetails &lt;- DatabaseConnector::createConnectionDetails(dbms = "postgresql",</code><br />
<code>server = "your_server",</code><br />
<code>user = "your_username",</code><br />
<code>password = "your_password")</code><br />
<code>cdmDatabaseSchema &lt;- "your_cdm_schema"</code><br />
<code>cohortDatabaseSchema &lt;- "your_cohort_schema"</code></p>
<h4>Step 2: Define the Cohorts of Interest</h4>
<p>Assume we have a cohort of patients with diabetes and another cohort with a history of cardiovascular disease. We want to create a feature that indicates whether a patient in the diabetes cohort has a prior history of cardiovascular disease.</p>
<p><code># Define cohort IDs (these would be predefined in your database)</code><br />
<code>diabetesCohortId &lt;- 1</code><br />
<code>cvdCohortId &lt;- 2</code></p>
<h4>Step 3: Create the Feature Extraction Settings</h4>
<p>Next, we define the feature extraction settings, specifying that we want to create covariates based on the presence of patients in the cardiovascular disease cohort:</p>
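<p><code># Covariate cohorts are passed as a data frame; analysisId is an arbitrary unique ID</code><br />
<code>covariateCohorts &lt;- data.frame(cohortId = cvdCohortId,</code><br />
<code>cohortName = "Cardiovascular disease")</code><br />
<code>covariateSettings &lt;- createCohortBasedCovariateSettings(analysisId = 999,</code><br />
<code>covariateCohortDatabaseSchema = cohortDatabaseSchema,</code><br />
<code>covariateCohortTable = "cohort",</code><br />
<code>covariateCohorts = covariateCohorts,</code><br />
<code>startDay = -365,</code><br />
<code>endDay = 0)</code></p>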
<p>In this example, the <code>startDay</code> and <code>endDay</code> parameters define a time window of one year prior to the cohort&#8217;s index date. This means the feature will reflect whether a patient was in the cardiovascular disease cohort within one year before the index date.</p>
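<p>To make the time window concrete, here is a minimal base R sketch on made-up data. It only illustrates the idea of a window relative to the index date: the data frames, column names and the rule of checking the cohort start date are assumptions for this example, not the package's internal implementation (which runs in the database and may, for instance, also take cohort end dates into account).</p>

```r
# Toy data: index dates for the target cohort and start dates for a
# hypothetical covariate cohort (all values are made up for illustration).
target <- data.frame(
  subjectId = c(1, 2, 3),
  indexDate = as.Date(c("2020-06-01", "2020-06-01", "2020-06-01"))
)
covariateCohort <- data.frame(
  subjectId = c(1, 2),
  cohortStartDate = as.Date(c("2020-01-15", "2018-03-10"))
)

# Same window as in the settings above: one year before the index date
startDay <- -365
endDay <- 0

# Join on subject and compute the offset in days relative to the index date
merged <- merge(target, covariateCohort, by = "subjectId", all.x = TRUE)
merged$dayOffset <- as.numeric(merged$cohortStartDate - merged$indexDate)

# 1 when the covariate cohort starts inside [startDay, endDay], else 0;
# subjects without a covariate cohort entry (NA offset) get 0
merged$inWindow <- as.integer(
  !is.na(merged$dayOffset) &
    merged$dayOffset >= startDay &
    merged$dayOffset <= endDay
)

print(merged[, c("subjectId", "dayOffset", "inWindow")])
```

<p>In this toy data, only subject 1 entered the covariate cohort within the year before the index date, so only that subject gets a value of 1.</p>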
<h4>Step 4: Extract the Features</h4>
<p>Now, we extract the features for the diabetes cohort using the settings we defined:</p>
<p><code>covariateData &lt;- getDbCovariateData(connectionDetails = connectionDetails,</code><br />
<code>cdmDatabaseSchema = cdmDatabaseSchema,</code><br />
<code>cohortDatabaseSchema = cohortDatabaseSchema,</code><br />
<code>cohortTable = "cohort",</code><br />
<code>cohortIds = diabetesCohortId,</code><br />
<code>covariateSettings = covariateSettings)</code></p>
<p>This function retrieves the covariate data for the specified cohort, based on the feature extraction settings we provided.</p>
<h4>Step 5: Use the Extracted Features</h4>
<p>The extracted features are now available in the <code>covariateData</code> object, which can be used for further analysis, such as model building or cohort characterization.</p>
</div>
<p><code># Explore the covariate data</code><br />
<code>summary(covariateData)</code></p>
<p>This simple example demonstrates how <strong>FeatureExtraction</strong> can be used to create meaningful features based on different cohorts. The package&#8217;s flexibility and scalability make it a powerful tool for a wide range of applications, from small-scale studies to large observational databases.</p>
<p>The post <a href="https://gerinberg.com/2024/09/03/featureextraction-on-cran/">FeatureExtraction on CRAN</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>DrugExposure Diagnostics</title>
		<link>https://gerinberg.com/2023/04/01/drugexposurediagnostics/</link>
		
		<dc:creator><![CDATA[Ger]]></dc:creator>
		<pubDate>Sat, 01 Apr 2023 09:32:00 +0000</pubDate>
				<category><![CDATA[data analysis]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[DrugExposureDiagnostics]]></category>
		<guid isPermaLink="false">https://gerinberg.com/?p=1848</guid>

					<description><![CDATA[<p>DrugExposureDiagnostics: A Comprehensive R Package for Assessing Drug Exposure in Clinical Research Drug exposure is an essential aspect of clinical research, as it directly affects the efficacy and safety of drugs. Measuring drug exposure accurately and understanding the factors that influence it is crucial for [&#8230;]</p>
<p>The post <a href="https://gerinberg.com/2023/04/01/drugexposurediagnostics/">DrugExposure Diagnostics</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>DrugExposureDiagnostics: A Comprehensive R Package for Assessing Drug Exposure in Clinical Research</p>
<p>Drug exposure is an essential aspect of clinical research, as it directly affects the efficacy and safety of drugs. Measuring drug exposure accurately and understanding the factors that influence it are crucial for clinical decision-making. This is where the R package DrugExposureDiagnostics comes in handy.</p>
<p>As the author of this R package, I am excited to introduce you to this powerful tool for analyzing drug exposure data. Before delving into the package, let&#8217;s first understand what drug exposure is and why it is crucial in clinical research.</p>
<p>Drug exposure refers to the extent to which a drug enters and stays in the body, thereby producing its intended therapeutic effects. Measuring drug exposure accurately involves capturing key metrics, such as drug concentrations, AUC, Cmax, and Tmax. By doing so, researchers can evaluate drug efficacy and safety and make informed decisions regarding dosing and administration.</p>
<p>One way to capture drug exposure data is through the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), developed by the Observational Health Data Sciences and Informatics (OHDSI) community. The OMOP CDM standardizes and integrates data from various sources, allowing for large-scale observational studies and analysis.</p>
<p>DrugExposureDiagnostics is a comprehensive tool for analyzing drug exposure data in the OMOP CDM format. The package includes functions for calculating various exposure metrics, handling missing data, and summarizing data at different levels, such as by subject or visit. Additionally, it provides tools for identifying outliers and comparing exposure between groups.</p>
<p>DrugExposureDiagnostics has been extensively tested and validated, ensuring that it produces accurate results. The package has been released on the <a href="https://cran.r-project.org/web/packages/DrugExposureDiagnostics/index.html">Comprehensive R Archive Network</a> (CRAN), making it easily accessible to R users worldwide. To use the package, simply install it using the install.packages() function in R and load it using the library() function.</p>
<p>If you are interested in learning more about DrugExposureDiagnostics or trying it out for yourself, visit the <a href="https://github.com/darwin-eu/DrugExposureDiagnostics">package&#8217;s GitHub repository</a>.</p>
<p>The post <a href="https://gerinberg.com/2023/04/01/drugexposurediagnostics/">DrugExposure Diagnostics</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Web scraping: R vs python</title>
		<link>https://gerinberg.com/2016/02/09/web-scraping-r-vs-python/</link>
					<comments>https://gerinberg.com/2016/02/09/web-scraping-r-vs-python/#comments</comments>
		
		<dc:creator><![CDATA[Ger]]></dc:creator>
		<pubDate>Tue, 09 Feb 2016 08:08:48 +0000</pubDate>
				<category><![CDATA[data science]]></category>
		<category><![CDATA[software engineering]]></category>
		<guid isPermaLink="false">http://inbergict.nl/blog/?p=78</guid>

					<description><![CDATA[<p>For my last post, I used a Python script to scrape the data from a website. I used Python simply because I am used to web scraping in Python. But I heard that R has also gotten better at scraping, so I rewrote my script in R. The package [&#8230;]</p>
<p>The post <a href="https://gerinberg.com/2016/02/09/web-scraping-r-vs-python/">Web scraping: R vs python</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>For my last post, I used a Python script to scrape the data from a website. I used Python simply because I am used to web scraping in Python. But I heard that R has also gotten better at scraping, so I rewrote my script in R.</p>
<p>The package <strong><a href="http://blog.rstudio.org/2014/11/24/rvest-easy-web-scraping-with-r/">rvest</a></strong> is the R equivalent of BeautifulSoup in Python. It has been available since 2014 and was created by Hadley Wickham. Under the hood, it uses the packages &#8216;httr&#8217; and &#8216;xml2&#8217; to download and manipulate HTML content.</p>
<p>You can use rvest in the following way:</p>
<p>[code language="r" wraplines="true" collapse="false"]<br />
# install and load package<br />
install.packages("rvest")<br />
library(rvest)<br />
<br />
# download the page and select the rows of the results table<br />
url &lt;- "http://live.ultimate.dk/desktop/front/?eventid=2021049&amp;language=nl"<br />
data &lt;- read_html(url)<br />
resultsTable &lt;- data %&gt;% html_nodes("table.leaderboard_table_results")<br />
rows &lt;- resultsTable %&gt;% html_nodes("tr")<br />
# print the text of the 4th and 10th cell of every row<br />
for (i in seq_along(rows)) {<br />
tds &lt;- rows[i] %&gt;% html_nodes("td")<br />
print(tds[4] %&gt;% html_text())<br />
print(tds[10] %&gt;% html_text())<br />
}<br />
[/code]</p>
<p>A couple of things are good to know:</p>
<ul>
<li>get the website content with read_html(&lt;URL&gt;): this will return an xml document</li>
<li>select content from certain nodes with html_nodes(&lt;element&gt;.&lt;classname&gt;)</li>
<li>get attribute content from a node with html_attr(&lt;name&gt;)</li>
<li>the pipe operator &#8220;%&gt;%&#8221; can be used to chain operations. Use it, it&#8217;s very convenient.</li>
</ul>
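<p>The points above can be tried without a live website. The sketch below parses a small inline HTML snippet; the table, class name, skater names and links are made up for this example and are not taken from the scraped site.</p>

```r
library(rvest)

# A small inline page (made up for this example) instead of a live URL
page <- read_html('
  <table class="results">
    <tr><td>Anna</td><td><a href="/skaters/1">profile</a></td></tr>
    <tr><td>Bert</td><td><a href="/skaters/2">profile</a></td></tr>
  </table>')

# <element>.<classname> selector, then drill down to the rows
rows <- page %>% html_nodes("table.results") %>% html_nodes("tr")

# Text of the first cell in every row
skaterNames <- rows %>% html_nodes("td:first-child") %>% html_text()

# Value of the href attribute of every link
links <- rows %>% html_nodes("a") %>% html_attr("href")

print(skaterNames)
print(links)
```

<p>Here html_text() returns the cell contents and html_attr("href") returns the link targets, each as a character vector with one element per matched node.</p>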
<p>If you are used to BeautifulSoup, rvest is easy to learn because it has a similar syntax.</p>
<p>You can find the R and Python scripts that I wrote for web scraping below. Which language do you prefer for web scraping? Please let me know in a comment below.</p>
<p><a href="https://github.com/ginberg/weissensee/blob/master/weissensee_scraper.py">python script</a> vs <a href="https://github.com/ginberg/weissensee/blob/master/weissensee_scraper.R">R script</a></p>
<p>The post <a href="https://gerinberg.com/2016/02/09/web-scraping-r-vs-python/">Web scraping: R vs python</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gerinberg.com/2016/02/09/web-scraping-r-vs-python/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>Internet of Things developersday</title>
		<link>https://gerinberg.com/2015/04/19/internet-of-things-developersday/</link>
		
		<dc:creator><![CDATA[Ger]]></dc:creator>
		<pubDate>Sun, 19 Apr 2015 20:08:57 +0000</pubDate>
				<category><![CDATA[software engineering]]></category>
		<guid isPermaLink="false">http://inbergict.nl/blog/?p=25</guid>

					<description><![CDATA[<p>Last week, I went to the developersday in the Jaarbeurs in Utrecht. The theme of the meeting was the Internet of Things. PKN has announced its IOT academy today. There were some interesting speakers and food for thought. The Internet of Things is not really new, it [&#8230;]</p>
<p>The post <a href="https://gerinberg.com/2015/04/19/internet-of-things-developersday/">Internet of Things developersday</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Last week, I went to the <a href="http://www.developersday.nl/">developersday</a> in the Jaarbeurs in Utrecht. The theme of the meeting was the Internet of Things. PKN has announced its <a href="http://iotacademy.nl/">IOT academy</a> today.<br />
There were some interesting speakers and food for thought. The Internet of Things is not really new; it is more an umbrella term for Ambient Intelligence, Ubiquitous Computing, Sensor Networks, Grid Computing, Service-oriented architectures and Mobile Communications, to name a few.<br />
It is also often mentioned in combination with big data and data analysis, because if we connect all those devices to the internet and put sensors in them, we will have a lot of data to process.<br />
Why do we want to connect all our devices to the internet? There must be some added value for <strong>us</strong> (and not only feeding data to big companies like Google, Facebook, etc.).<br />
A nice example of the added value of connected devices is the control of <a href="http://www.newscenter.philips.com/main/standard/news/press/2015/20150408-los-angeles-becomes-first-city-in-the-world-to-control-its-street-lighting-through-mobile-and-cloud-based-technologies-from-philips.wpd#.VTQFwCe1FBc">street lighting in Los Angeles by Philips</a>.</p>
<p>During the day it was also possible to attend workshops. I did a workshop on connecting a temperature sensor to an Arduino. By connecting the Arduino to our laptop, we were able to display the temperature and humidity.</p>
<p><a href="http://inbergict.nl/blog/wp-content/uploads/2015/04/IMAG0342.jpg"><img decoding="async" src="http://inbergict.nl/blog/wp-content/uploads/2015/04/IMAG0342-300x170.jpg" alt="IMAG0342" width="250" height="170" class="alignnone size-medium wp-image-29" /></a><br />
<a href="http://inbergict.nl/blog/wp-content/uploads/2015/04/IMAG0343.jpg"><img decoding="async" src="http://inbergict.nl/blog/wp-content/uploads/2015/04/IMAG0343-300x170.jpg" alt="IMAG0343" width="250" height="170" class="alignnone size-medium wp-image-30" /></a><br />
<a href="http://inbergict.nl/blog/wp-content/uploads/2015/04/IMAG0344.jpg"><img decoding="async" src="http://inbergict.nl/blog/wp-content/uploads/2015/04/IMAG0344-300x170.jpg" alt="IMAG0344" width="250" height="170" class="alignnone size-medium wp-image-31" /></a><br />
<a href="http://inbergict.nl/blog/wp-content/uploads/2015/04/arduino.jpg"><img loading="lazy" decoding="async" src="http://inbergict.nl/blog/wp-content/uploads/2015/04/arduino-300x171.jpg" alt="arduino" width="250" height="171" class="alignnone size-medium wp-image-26" /></a></p>
<p>The post <a href="https://gerinberg.com/2015/04/19/internet-of-things-developersday/">Internet of Things developersday</a> appeared first on <a href="https://gerinberg.com">Ger Inberg</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
