Competition: COVID19 Detection in Blood Exams
The world currently suffers from the global COVID19 pandemic.
Billions of people have been impacted, and millions of casualties have already occurred. These numbers continue to increase, as a vaccine has not yet been made available.
Therefore, it is extremely important to identify individuals who are or have been infected by the SARS-CoV-2 virus.
Such identification helps public health organizations and governments plan actions to reduce the impact of the pandemic. In this context, Hilab is a remote laboratory company that performs dozens of types of blood exams, including serology tests for COVID19, and has already performed millions of exams in Brazil.
To improve the detection of the virus, machine learning methods can be used to aid laboratory experts in the decision-making process.
Therefore, this competition poses the difficult problem of building machine learning models with high confidence and accuracy for the detection of COVID19.
Macro F1 Score
Atish Kumar Dipongkor
Hilab is a remote laboratory company from Brazil with thousands of blood scanner points distributed across the country, mostly in hospitals and pharmacies.
The company has received great attention in the country due to its fast exam time, including for the detection of COVID19, for which patients receive the exam's certificate in approximately 15 minutes.
Once the blood sample is collected from the patient, it is digitized and sent to a central laboratory in the city of Curitiba (Brazil), where expert biomedicians analyze the samples and indicate whether the patient is infected.
Because the equipment digitizes blood samples and the number of exams is high, enough data has become available to build strong machine learning models.
Such models can aid the decision-making process of the expert biomedicians, enabling more accurate detection of blood infections.
To improve the detection of COVID19 in individuals, Hilab proposes a competition on the difficult task of classifying different types of contamination using available processed data collected with the blood scanner and labeled by the expert biomedicians.
Hilab will make available a single dataset, composed of thousands of labeled samples, which competitors can use to train and validate their models. The dataset poses a binary classification problem, in which the label must be correctly predicted from a multi-dimensional input. A link to the dataset will be sent to the competitors upon registration.
The submission consists of a Python package (a template package will be made available) containing the source code for the preprocessing and classification steps. Teams may have a maximum of 5 members. Teams must submit their solutions by May 31st, according to the instructions received upon registration.
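Since a single labeled dataset has to serve for both training and validation, one common approach is to hold out part of it. The following is a minimal sketch, not something prescribed by the competition: the function name `stratified_split` and its parameters `val_fraction` and `seed` are hypothetical. It splits sample indices so that each class keeps roughly the same proportion in both parts, which can matter if the two classes turn out to be imbalanced.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split sample indices into train/validation parts, class by class.

    Hypothetical helper for illustration; the competition does not
    prescribe any particular validation scheme.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for validation.
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])

    return sorted(train_idx), sorted(val_idx)


# Toy binary labels: 80 negative and 20 positive samples.
labels = [0] * 80 + [1] * 20
train_idx, val_idx = stratified_split(labels, val_fraction=0.2, seed=42)
```

The returned index lists can then be used to select the corresponding rows of the input features, whatever format the distributed dataset uses.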
C. Evaluation Metrics
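Solutions will be evaluated on a test dataset, which is not available to the competitors. The final score will be assessed using the Macro F1 score, computed as follows:
$$\text{Macro F1} = \frac{1}{|C|} \sum_{c \in C} \frac{2 \cdot PPV_c \cdot TPR_c}{PPV_c + TPR_c}$$
where C is the set of classes, TPR_c is the recall, and PPV_c is the precision for class c.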
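To check a model locally before submitting, the Macro F1 score can be computed by hand. The sketch below assumes the usual definition, the unweighted mean of the per-class F1 scores, where each per-class F1 is the harmonic mean of precision (PPV) and recall (TPR). The function name `macro_f1` and the convention of scoring a class as 0 when its precision or recall is undefined are assumptions; the organizers' evaluation script may differ in such edge cases.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample is not p
            fn[t] += 1  # sample of class t was missed

    scores = []
    for c in set(y_true) | set(y_pred):
        # Assumption: undefined precision/recall counts as 0.
        ppv = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        tpr = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * ppv * tpr / (ppv + tpr) if ppv + tpr else 0.0
        scores.append(f1)

    return sum(scores) / len(scores)


# Toy example with binary labels (1 = infected, 0 = not infected).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally to the average, a model that ignores the less frequent class is penalized even if its overall accuracy is high.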
Final results with scores for the competitors will be made available during the competition's session at IJCNN, where the best solution (highest Macro F1 score) will receive a prize of US$ 1,000.00 (to be split among the team members).
Contest Public Notice
- Gabriela Steinhaus
- Guilherme Calesco
- Marcelo Cossetin
- Marcus Figueredo
- Raynne Andrade
- Victor Henrique