Implementation of Static Code Analysis to Detect Vulnerabilities in Applications Developed with the Assistance of Large-Language Models (LLM)
DOI:
https://doi.org/10.51747/energy.v15i2.15210

Keywords:
Large Language Models, Code Security, Static Code Analysis, Web Development

Abstract
The emergence of large language models (LLMs) such as ChatGPT and GitHub Copilot has transformed software development, including in higher education, where students can now easily generate PHP code for Laravel web applications. This research applies static code analysis with PHPStan to detect security vulnerabilities in student-developed PHP code that was likely written with LLM assistance. The analysis covered the full codebases of 28 capstone projects, focusing on those whose code showed patterns consistent with heavy reliance on LLM output. The results show that 64.16% of the LLM-assisted code neglects data sanitization, uses raw queries without parameterization, and contains vulnerable authentication logic. This study contributes to students' web application security literacy and recommends static analysis as both a pedagogical and a preventive tool.
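As an illustration of the raw-query pattern the abstract refers to, the minimal sketch below contrasts a string-interpolated query with a validated, parameterized one in a Laravel controller. This is a hypothetical example: the class, method, and column names (UserLookupController, findUnsafe, email) are assumptions for illustration and are not taken from the student projects analyzed in the study.

```php
<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use Illuminate\Support\Facades\DB;

// Hypothetical controller for illustration only; names are not from the study.
class UserLookupController extends Controller
{
    // Pattern the abstract flags as vulnerable: user input is interpolated
    // directly into a raw SQL string, so a crafted "email" value can inject SQL.
    public function findUnsafe(Request $request)
    {
        $email = $request->input('email');

        return DB::select("SELECT * FROM users WHERE email = '$email'");
    }

    // Safer alternative: the value is validated first and then passed as a
    // query binding, so the database driver handles escaping.
    public function findSafe(Request $request)
    {
        $validated = $request->validate(['email' => 'required|email']);

        return DB::select('SELECT * FROM users WHERE email = ?', [$validated['email']]);
    }
}
```

In a default Laravel project, a typical PHPStan run over such code might look like `vendor/bin/phpstan analyse app` with a chosen `--level`, optionally together with the Larastan extension for framework-aware rules; note that detecting injection-prone data flow generally requires rule sets or extensions beyond PHPStan's default type-level checks.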
License
Copyright (c) 2025 ENERGY: JURNAL ILMIAH ILMU-ILMU TEKNIK

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.