Draft:Focal Loss
Focal loss is a loss function used in machine learning to address class imbalance in classification tasks, particularly dense object detection. It modifies the cross-entropy loss so that examples the model already classifies correctly with high confidence contribute less to the total loss. It was introduced by Lin et al. (2017) together with the RetinaNet architecture.[1] It has since been used as a training loss in other computer vision tasks, including classification and segmentation.
Background
In classification problems with a significant imbalance between foreground and background classes, the total of a standard loss function such as cross-entropy may be dominated by easy, majority-class examples: each contributes only a small loss, but there are so many of them that their sum outweighs the loss from the few hard examples. As a result, the model may learn poorly on hard or minority-class examples. Focal loss modifies the standard cross-entropy loss to focus training on hard examples and down-weight the contribution of easy ones.[1]
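The domination effect can be illustrated with a small calculation. The following is a minimal sketch, not taken from the cited sources: the batch composition (1,000 easy examples at probability 0.95 and 10 hard examples at probability 0.10) and all names are hypothetical values chosen for illustration.

```python
import math

def cross_entropy(p_t):
    """Cross-entropy loss for one example, given the probability p_t of its true class."""
    return -math.log(p_t)

# Hypothetical batch: many easy background examples, few hard foreground ones.
easy = [0.95] * 1000   # classified correctly with high confidence
hard = [0.10] * 10     # badly misclassified

easy_total = sum(cross_entropy(p) for p in easy)
hard_total = sum(cross_entropy(p) for p in hard)
easy_share = easy_total / (easy_total + hard_total)

print(f"easy: {easy_total:.1f}, hard: {hard_total:.1f}, easy share: {easy_share:.0%}")
```

Under these assumed numbers, each easy example has a loss about 45 times smaller than each hard example, yet the easy examples together account for roughly 69% of the summed loss.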
Mathematical Formulation
Let $p_t \in [0, 1]$ be the model's estimated probability for the true class label $t \in \{1, \dots, K\}$, where $K$ is the number of classes. The focal loss is defined as:[1]
$$\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$$
where:
- $\gamma \geq 0$ is the focusing parameter that adjusts the rate at which easy examples are down-weighted,
- $\alpha_t$ is a weighting factor to address class imbalance.
When $\gamma = 0$ (and $\alpha_t = 1$), focal loss reduces to the standard cross-entropy loss $-\log(p_t)$. Larger values of $\gamma$ place more focus on hard, misclassified examples.
The term $(1 - p_t)^{\gamma}$ is referred to as the focusing factor. It down-weights examples that the model already classifies with high confidence (that is, when $p_t \to 1$) and preserves the contribution of harder, misclassified examples (when $p_t \to 0$). As $\gamma$ increases, the loss for well-classified samples is substantially reduced, while the loss for hard samples is affected much less. This focusing effect reshapes the gradient so that training emphasizes difficult examples and mitigates the dominance of abundant "easy" negatives in settings with severe class imbalance.
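The behaviour of the focusing factor can be checked numerically. The following is a minimal per-example sketch, assuming the form $\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$; the function name `focal_loss`, its default arguments, and the two sample probabilities are illustrative choices rather than part of any particular library.

```python
import math

def focal_loss(p_t, gamma=2.0, alpha_t=1.0):
    """Focal loss for one example, given the predicted probability p_t of its true class."""
    if not 0.0 < p_t <= 1.0:
        raise ValueError("p_t must lie in (0, 1]")
    focusing_factor = (1.0 - p_t) ** gamma   # close to 0 for easy examples, close to 1 for hard ones
    return -alpha_t * focusing_factor * math.log(p_t)

# Compare an easy example (p_t = 0.9) with a hard one (p_t = 0.1).
for p_t in (0.9, 0.1):
    ce = focal_loss(p_t, gamma=0.0)   # gamma = 0 and alpha_t = 1 give plain cross-entropy
    fl = focal_loss(p_t, gamma=2.0)
    print(f"p_t={p_t}: CE={ce:.4f}, FL={fl:.4f}, FL/CE={fl / ce:.2f}")
```

With $\gamma = 2$, the loss of the easy example is scaled by about 0.01 relative to cross-entropy, while the loss of the hard example is scaled by only 0.81. In practice the loss would be computed on batches of logits with a numerically stable log-probability; this sketch omits that for clarity.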

Applications
Focal loss was originally proposed for use in the RetinaNet architecture,[1] which achieved state-of-the-art performance in object detection on benchmarks such as COCO. It has since been adopted in a variety of tasks, including:
- Dense object detection[1]
- Semantic segmentation[2]
- Medical image analysis[3]
- Multi-class classification under imbalance[4]
Focal loss has also been applied outside machine learning, for example in lossy source coding.[5]
Variants and Generalizations
Several extensions of focal loss have been proposed to address specific types of class imbalance or to adapt the loss to different prediction tasks.
- Generalized focal loss with tunable curvature for other divergence measures.[6]
- Focal Tversky loss for segmentation tasks.[7] The focal Tversky loss replaces the cross-entropy base with the Tversky index, which targets medical image segmentation problems where foreground regions are small and false negatives and false positives need to be weighted differently.
- Asymmetric focal loss for multi-label classification.[8] Asymmetric focal loss uses different focusing strengths for positive and negative classes, allowing the loss to penalize false negatives more heavily than false positives.
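The last two variants can be sketched in simplified form. The code below is an illustration under stated assumptions, not the reference implementation of either cited paper: the asymmetric loss omits any further mechanisms the original method may use beyond separate focusing parameters, the focal Tversky loss is assumed to take the form $(1 - \mathrm{TI})^{1/\gamma}$, and all function names and default values are hypothetical.

```python
import math

def asymmetric_focal_loss(p, y, gamma_pos=0.0, gamma_neg=4.0):
    """Binary focal loss for one label with separate focusing for positives and negatives.

    p is the predicted probability of the positive class, y is 1 or 0.
    The default gammas are illustrative assumptions.
    """
    if y == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(p)
    return -(p ** gamma_neg) * math.log(1.0 - p)

def tversky_index(pred, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Soft Tversky index; alpha weights false negatives, beta weights false positives."""
    tp = sum(p * t for p, t in zip(pred, target))
    fn = sum((1.0 - p) * t for p, t in zip(pred, target))
    fp = sum(p * (1.0 - t) for p, t in zip(pred, target))
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)

def focal_tversky_loss(pred, target, gamma=4.0 / 3.0):
    """Assumed form (1 - TI) ** (1 / gamma); gamma = 1 gives the plain Tversky loss."""
    ti = tversky_index(pred, target)
    return max(0.0, 1.0 - ti) ** (1.0 / gamma)   # clamp guards against rounding below zero

# An easy negative is suppressed far more strongly than an equally easy positive.
print(asymmetric_focal_loss(0.1, 0), asymmetric_focal_loss(0.9, 1))
# A mostly missed foreground pixel yields a loss strictly between 0 and 1.
print(focal_tversky_loss([0.2, 0.1], [1, 0]))
```

Choosing `gamma_neg` larger than `gamma_pos` shrinks the contribution of confidently rejected negatives while leaving positives nearly untouched, which is one way to obtain the heavier relative penalty on false negatives described above.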
References
1. Lin, Tsung-Yi; Goyal, Priya; Girshick, Ross; He, Kaiming; Dollár, Piotr (2017). "Focal Loss for Dense Object Detection". 2017 IEEE International Conference on Computer Vision (ICCV). IEEE. pp. 2980–2988. doi:10.1109/ICCV.2017.324.
2. Azad, Reza; Heidari, Moein; Yilmaz, Kadir; Hüttemann, Michael; Karimijafarbigloo, Sanaz; Wu, Yuli; Schmeink, Anke; Merhof, Dorit (2023). "Loss Functions in the Era of Semantic Segmentation: A Survey and Outlook". arXiv:2312.05391 [cs.CV].
3. Abraham, Nabila; Khan, Naimul Mefraz (2019). "A Novel Focal Tversky Loss Function With Improved Attention U-Net for Lesion Segmentation". IEEE International Symposium on Biomedical Imaging (ISBI).
4. Krawczyk, Bartosz (2016). "Learning from Imbalanced Data: Open Challenges and Future Directions". Progress in Artificial Intelligence. 5 (4): 221–232. doi:10.1007/s13748-016-0094-0.
5. Dytso, Alex; Cardone, Martina (2025). "Lossy Source Coding with Focal Loss". arXiv:2504.19913 [cs.IT].
6. Kimura, Masanari; Naganuma, Hiroki (2025). "Geometric Insights into Focal Loss: Reducing Curvature for Enhanced Model Calibration". Pattern Recognition Letters. 189: 195. arXiv:2405.00442. Bibcode:2025PaReL.189..195K. doi:10.1016/j.patrec.2025.01.031.
7. Salehi, Seyed Sadegh Mohseni; Erdogmus, Deniz; Gholipour, Ali (2017). "Tversky loss function for image segmentation using 3D fully convolutional deep networks". arXiv:1706.05721 [cs.CV].
8. Ben-Baruch, Emanuel; Ridnik, Tal; Zamir, Nadav; Noy, Asaf; Friedman, Itamar; Protter, Matan; Zelnik-Manor, Lihi (2020). "Asymmetric Loss For Multi-Label Classification". arXiv:2009.14119 [cs.CV].