Self-reported quality of life following stroke: a systematic review of instruments with a focus on their psychometric properties
Quality of Life Research - To evaluate the psychometric properties of common health-related quality-of-life instruments used post stroke and provide recommendations for research and clinical use...
The development and psychometric properties of oral health assessment instruments used by non-dental professionals for nursing home residents: a systematic review - BMC Geriatrics
Background Globally, the oral health status of the geriatric population residing in nursing homes is poor. The integration of non-dental professionals is vital for monitoring oral health, identifying and triaging oral health problems early, and ensuring timely referral to dental professionals. The aims of this systematic review were to provide a summary on the development and characteristics of oral health assessment instruments currently used by non-dental professionals for nursing home residents, and to perform a critical appraisal of their psychometric properties. Methods This review was conducted as per the PRISMA guidelines. CINAHL (EBSCO), Medline (Ovid), and EMBASE (Ovid) were searched systematically. Two reviewers independently screened the title, abstract, and full text of the studies as per the eligibility criteria. Studies describing oral health assessment instruments used to assess oral health of nursing home residents by non-dental professionals were included. Using a methodological framework, each instrument was evaluated for purpose, content, and psychometric properties related to validity, reliability, feasibility, generalisability, and responsiveness. Additionally, the reporting quality assessment of each included study was performed according to the SURGE guidelines. Results Out of the 819 screened articles, 10 studies were included in this review. The 10 identified instruments integrated 2 to 12 categories to assess oral health, which was scored on a 2 to 5-point scale. However, the measurement content varied widely, and none were able to comprehensively measure all aspects of oral health. Three measurement approaches were identified: performance-based assessment, direct inspection of the oral health status, and interview measures.
Only eight instruments provided quality assessment on the basis of validity, reliability, feasibility and generalisability, whereas three instruments (the Brief Oral Health Status Examination, Dental Hygiene Registration, and Oral Health Assessment Tool) reported good methodological quality on at least one assessment criterion. Conclusions None of the instruments identified in this review provided a comprehensive assessment of oral health, while three instruments appeared to be valid and reliable. Nonetheless, continuous development of instruments is essential to embrace the complete spectrum of oral health and address the psychometric gaps.
Disorder- and Treatment-Specific Therapeutic Competence Scales for Posttraumatic Stress Disorder Intervention: Development and Psychometric Properties - PubMed
Although the assessment of therapeutic competence in psychotherapy research is essential for examining its possible associations with treatment outcomes, it is often neglected due to high costs and a lack of valid instruments. This study aimed to develop two therapeutic competence scales that assess …
Psychometric properties of implementation measures for public health and community settings and mapping of constructs against the Consolidated Framework for Implementation Research: a systematic review - Implementation Science
Background Recent reviews have synthesised the psychometric properties of measures developed to examine implementation science constructs in healthcare and mental health settings. However, no reviews have focussed primarily on the properties of measures developed to assess innovations in public health and community settings. This review identified quantitative measures developed in public health and community settings, examined their psychometric properties, and described how the domains of each measure align with the five domains and 37 constructs of the Consolidated Framework for Implementation Research (CFIR). Methods MEDLINE, PsycINFO, EMBASE, and CINAHL were searched to identify publications describing the development of measures to assess implementation science constructs in public health and community settings. The psychometric properties of each measure were assessed against recommended criteria for validity (face/content, construct, criterion), reliability (internal consistency, test-retest), responsiveness, acceptability, feasibility, and revalidation and cross-cultural adaptation. Relevant domains were mapped against implementation constructs defined by the CFIR. Results Fifty-one measures met the inclusion criteria. The majority of these were developed in schools, universities, or colleges and other workplaces or organisations. Overall, most measures did not adequately assess or report psychometric properties. Forty-six percent of measures using exploratory factor analysis reported >50 % of variance was explained by the final model; none of the measures assessed using confirmatory factor analysis reported root mean square error of approximation (0.95). Fifty percent of measures reported Cronbach’s alpha of 0.40). Twenty-five percent of measures reported revalidation or cross-cultural validation. 
The CFIR constructs most frequently assessed by the included measures were relative advantage, available resources, knowledge and beliefs, complexity, implementation climate, and other personal resources (assessed by more than ten measures). Five CFIR constructs were not addressed by any measure. Conclusions This review highlights gaps in the range of implementation constructs that are assessed by existing measures developed for use in public health and community settings. Moreover, measures with robust psychometric properties are lacking. Without rigorous tools, the factors associated with the successful implementation of innovations in these settings will remain unknown.
Measuring characteristics of individuals: An updated systematic review of instruments’ psychometric properties - Cameo Stanick, Heather Halko, Kayne Mettert, Caitlin Dorsey, Joanna Moullin, Bryan Weiner, Byron Powell, Cara C Lewis, 2021
Background: Identification of psychometrically strong implementation measures could (1) advance researchers’ understanding of how individual characteristics imp...
A review of the content and psychometric properties of cancer-related fatigue (CRF) measures used to assess fatigue in intervention studies
Supportive Care in Cancer - Cancer-related fatigue (CRF) is a common and debilitating consequence of cancer and its treatment. Numerous supportive care interventions have been developed to...
Functional, motor, and sensory assessment instruments upon nerve repair in adult hands: systematic review of psychometric properties - Systematic Reviews
Background Outcome after nerve repair of the hand needs standardized psychometrically robust measures. We aimed to systematically review the psychometric properties of available functional, motor, and sensory assessment instruments after nerve repair. Methods This systematic review of health measurement instruments searched databases from 1966 to 2017. Pairs of raters conducted data extraction and quality assessment using a structured tool for clinical measurement studies. Kappa correlation was used to define the agreement prior to consensus for individual items, and intraclass correlation coefficient (ICC) was used to assess reliability between raters. A narrative synthesis described quality and content of the evidence. Results Sixteen studies were included for final critical appraisal scores. Kappa ranged from 0.31 to 0.82 and ICC was 0.81. Motor domain had manual muscle testing with Kappa from 0.72 to 0.93 and a dynamometer ICC reliability between 0.92 and 0.98. Sensory domain had touch threshold Semmes-Weinstein monofilaments (SWM) as the most responsive measure while two-point discrimination (2PD) was the least responsive (effect size 1.2 and 0.1). A stereognosis test, Shape and Texture Identification (STI), had Kappa test-retest reliability of 0.79 and inter-rater reliability of 0.61, with excellent sensibility and specificity. Manual tactile test had moderate to mild correlation with 2PD and SWM. Function domain presented Rosén-Lundborg score with Spearman correlations of 0.83 for total score. Patient-reported outcomes measurements had ICC of 0.85 and internal consistency from 0.88 to 0.96 with Patient-Rated Wrist and Hand Evaluation with higher score for reliability and Spearman correlation between 0.38 and 0.89 for validity. Conclusions Few studies included nerve repair in their sample for the psychometric analysis of outcome measures, so moderate evidence could be confirmed. 
Manual muscle test and Rotterdam Intrinsic Hand Myometer dynamometer had excellent reliability but insufficient data on validity or responsiveness. Touch threshold testing was more responsive than 2PD test. The locognosia test and STI had limited but positive supporting data related to validity. Rosén-Lundborg score had emerging evidence of reliability and validity as a comprehensive outcome following nerve repair. Few questionnaires were considered reliable and valid to assess cold intolerance. There is no patient-reported outcome measurement following nerve repair that provides comprehensive assessment of symptoms and function by patient perspective.
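The kappa agreement values quoted in this abstract can be illustrated with a short sketch. It assumes the statistic meant is Cohen's kappa for two raters assigning categorical scores, which the abstract does not state explicitly; the function name and the example ratings are hypothetical and are not taken from the review.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same cases (categorical labels)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: 1 = criterion met, 0 = criterion not met.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
print(round(kappa, 2))
```

Kappa corrects raw percentage agreement for the agreement two raters would reach by chance, which is why it is usually preferred over simple percent agreement when the rating categories are unevenly used.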
Reliability, validity and relevance of needs assessment... : JBI Evidence Synthesis
all stages of the disease. In addition, they often indicate that health care providers insufficiently attend and adapt to their multiple needs. A systematic and patient-centered assessment is needed to address this lack of knowledge and understanding. However, existing quantitative needs assessment questionnaires are limited in terms of psychometric testing. Qualitative measures are time-intensive and difficult to conduct on a large scale, with growing economic pressure. Information about the methodological quality and the characteristics of needs assessment instruments are crucial for clinicians and researchers to make informed decisions about the most reliable and valid tool for their specific purpose. Inclusion criteria: This review considered studies on multidimensional needs assessment instruments for informal dementia caregivers living at home. Psychometric studies or other types of studies with sufficient data to evaluate methodological quality were included if they considered at least one outcome for reliability or validity. Methods: Studies in English, French or German and published until February 2019 were searched in four databases: Embase, MEDLINE, CINAHL and PsycINFO. After screening the titles, abstracts or full texts for eligibility, the provisional included studies were assessed for methodological quality with a standardized tool for systematic reviews of measurement properties. After data extraction using a standardized tool, the quality of the measurement properties was rated and compared using predefined quality criteria. Results: Eighteen articles covering 14 different needs assessment instruments were included in the review. Eleven publications focused on the development or the evaluation of an instrument. In addition, a development report, a manual and five studies, not aimed primarily at validation but containing sufficient information about the development or the evaluation of the used instruments, were included. 
The systematic evaluation of the instruments revealed that half of them had excellent content validity. In contrast, structural validity was rarely examined, and mostly with an insufficient sample size or a questionable analysis. None of the instruments had optimally tested and good internal consistency. Regarding reliability, test-retest agreement was rarely tested and inter-rater agreement was evaluated using controversial procedures. Comparing the different instruments reviewed, the “Partnering for better health – living with chronic illness: dementia” had the best psychometric evidence, and the “Questionnaire of consultation expectations” was also partly supported, while most other instruments presently had limited psychometric soundness. Conclusions: Despite the good evidence for some psychometric properties, further developments in the field of needs assessment for informal dementia caregivers are needed, particularly regarding structural and construct validity, as well as test-retest reliability and sensitivity to change. To enhance conceptual clarity, the development of an underlying theoretical model of needs should be prioritized....
Collective efficacy in soccer teams: a systematic review - Psicologia: Reflexão e Crítica
Collective efficacy, defined as a group’s shared belief about its conjoint capability to organize and execute courses of action, plays a pivotal role in understanding the dynamics of sports teams, since it influences what individuals choose to do as team members, how much they invest in motivational terms to perform actions, how much they work collectively, and for how long they persist despite failure. Through a systematic review, it was investigated how collective efficacy has been assessed in the context of soccer and which indicators, attributes, and psychometric properties have been contemplated in the instruments used. Following the PRISMA guidelines, 22 articles were retrieved through electronic databases (APA PsycINFO; SPORTDiscus; Science Direct; BVS; Web of Science; Scopus; PubMed; and Scielo), using as descriptors, in English, Spanish, and Portuguese, collective efficacy and soccer, combined by the Boolean operators AND and OR. The study did not delimit the initial year of publication for the searches carried out, including all articles found until January 14, 2021 (date of the last update). The following eligibility criteria were adopted: scientific articles published in journals; original studies, which specified the instrument used to assess collective efficacy and carried out with soccer athletes. Five instruments (FCEQ, CEQS, CEI, CEC, and CEQsoccer) that evaluated technical-tactical and psychological attributes associated with collective efficacy in soccer players were identified. In most studies, psychometric properties were restricted to content validity and reliability (internal consistency), and there were no suitable validation processes for the instruments used to measure collective efficacy, which can be considered a limiting factor for understanding this psychological construct in soccer modality.
Psychometric Properties of Suboptimal Health Status Instruments: A Systematic Review
Background: Suboptimal health status (SHS) measurement has now been recognized as an essential construct in predictive, preventive, and personalized medicine. Currently, there are limited tools, and an ongoing debate about appropriate tools. Therefore, it is crucial to evaluate and generate conclusive evidence about the psychometric properties of available SHS tools. Objective: This research aimed to identify and critically assess the psychometric properties of available SHS instruments and provide recommendations for their future use. Methods: Articles were retrieved by following the guidelines of the PRISMA checklist, and the robustness of methods and evidence about the measurement properties was assessed using the adapted COSMIN checklist. The review was registered in PROSPERO. Results: The systematic review identified 14 publications describing four subjective SHS measures with established psychometric properties; these included the Suboptimal Health Status Questionnaire-25 (SHSQ-25), Sub-health Measurement Scale Version 1.0 (SHMS V1.0), Multidimensional Sub-health Questionnaire of Adolescents (MSQA), and the Sub-Health Self-Rating Scale (SSS). Most studies were conducted in China and reported three reliability indices: (1) internal consistency, with Cronbach’s α values ranging between 0.70 and 0.96; (2) test–retest reliability, with coefficients ranging between 0.64 and 0.98; and (3) split-half reliability, with coefficients ranging between 0.83 and 0.96. Validity coefficients were above 0.71 for the SHSQ-25, ranged from 0.64 to 0.87 for the SHMS V1.0, and ranged from 0.74 to 0.96 for the SSS. Using these existing and well-characterized tools rather than constructing original tools is beneficial, given that the existing tools demonstrated sound psychometric properties and established norms. Conclusions: The SHSQ-25 stood out as being more suitable for the general population and routine health surveys, because it is short and easy to complete.
Therefore, there is a need to adapt this tool by translating it into other languages, including Arabic, and establishing norms based on populations from other regions of the world.
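Two of the reliability indices named in this abstract, Cronbach's α and split-half reliability, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the reviewed studies: the split is an odd/even item split with the Spearman-Brown correction, sample variances are used throughout, and the function names and response data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent (k >= 2)."""
    k = len(rows[0])
    item_columns = list(zip(*rows))
    # Sum of the individual item variances versus variance of the total score.
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    numerator = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denominator = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return numerator / denominator


def split_half_reliability(rows):
    """Odd/even split-half reliability with the Spearman-Brown correction."""
    first_half = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, ...
    r = pearson(first_half, second_half)
    return 2 * r / (1 + r)


# Hypothetical responses: 4 respondents, 3 items each.
responses = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(cronbach_alpha(responses), split_half_reliability(responses))
```

The Spearman-Brown step is needed because the raw correlation between two half-tests underestimates the reliability of the full-length instrument.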
Psychometric properties of cognitive screening for patients with cerebrovascular diseases: A systematic review
Screening instruments are ideal for acute clinical settings because they are easy to apply, fast, inexpensive and sensitive for specific samples. However, there is a need to verify the psychometric properties of screening in stroke patients. This study ...
Psychometric Properties of the EQ-5D for the Assessment of Health-Related Quality of Life in the Population of Middle-Old and Oldest-Old Persons: Study Protocol for a Systematic Review
Introduction: Health care interventions for middle-old and oldest-old individuals (75 years or older) are often economically evaluated using the EuroQol questionnaire (EQ-5D) to measure health-related quality of life. However, the psychometric performance of the EQ-5D in this population has been questioned, as it probably does not adequately capture relevant aspects of quality of life in the older population. Because the results of economic evaluations using the EQ-5D often guide decision-makers, it is important to know whether the EQ-5D has satisfactory psychometric properties in the middle-old and oldest-old population. Therefore, studies assessing the psychometric properties of the EQ-5D in this population should be synthesized by a systematic review. Methods and Analysis: A systematic review of studies providing empirical evidence of reliability, validity, and/or responsiveness of the EQ-5D in a sample with a mean age ≥75 years will be conducted. The databases PubMed, Web of Science, and EconLit will be searched. In addition, reference lists of included studies will be hand-searched. Two independent reviewers will select studies and assess their risk of bias with the COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) Risk of Bias checklist. Relevant data will be extracted by one reviewer and cross-checked by a second reviewer. Potential disagreements in any phase will be resolved through discussion with a third person. The guidelines ...
Assessment of content validity and psychometric properties of VISA-A for Achilles tendinopathy
A recent COSMIN review found that the Victorian Institute of Sports Assessment–Achilles tendinopathy questionnaire (VISA-A) has flawed construct validity. The objective of the current study was to assess specifically the process of how VISA-A was constructed and validated, and whether the Danish version of VISA-A is a valid patient-reported outcome measure (PROM) for measuring the perceived impact of Achilles tendinopathy. The original item generation strategy for content validity and the process for confirming the scaling properties (construct validity) were examined. In addition, construct validity was evaluated directly using several psychometric methods (Rasch analysis, confirmatory factor analysis (CFA), and multivariable linear regression) in a cohort of 318 persons with Achilles tendinopathy with symptom duration groups ranging from less than 3 months to more than 1 year of chronicity, and a group of 120 healthy persons. We found that the item generation and item reduction in the original construction of VISA-A was based on literature review and clinician consensus with little or no patient involvement. We determined that 1) VISA-A consists of ambiguous conceptual item themes and thus lacks content validity, 2) there was no thorough investigation of the psychometric properties of the original version of VISA-A, which thus lacks construct validity, and 3) rigorous direct assessment of the psychometric properties of the Danish VISA-A revealed inadequate psychometric properties. In agreement with the COSMIN study, we conclude that when used as a single score, VISA-A is not an adequate scale for measuring self-reported impact of Achilles tendinopathy.
Using Expert Panels to Examine the Content Validity and Inter-Rater Reliability of the ABLLS-R
Journal of Developmental and Physical Disabilities - The assessment literature cites several instruments used to assess the skills of children with an autism spectrum disorder (ASD) diagnosis, but...
The MacCAT-CA and the ECST-R in Competency to Stand Trial Evaluations: A Critical Review and Practical Implications
There is debate regarding the utility of standardized instruments in the assessment of competence to stand trial (CST). Though the field generally has a positive view of the second-generation nomot...
Intolerance of Uncertainty Scale-12: Psychometric Properties of This Construct Among Iranian Undergraduate Students
Background: Uncertainty intolerance (IU), the tendency to think or react negatively toward uncertain events, may have implications for individuals’ mental health and psychological wellbeing. The Intolerance of Uncertainty Scale-12 (IU-12) is commonly used across the globe to measure IU; however, its psychometric properties are yet to be evaluated in Iran with a Persian-speaking population. Therefore, the purpose of this research was to translate and validate the IU-12 among Iranian undergraduate students. Materials and Methods: Multi-stage cluster random sampling was employed to recruit 410 Iranian undergraduate students (260 females) from the Azad University to complete the IU-12, the Depression Anxiety Stress Scale-2, and the Penn State Worry Questionnaire in a cross-sectional design. In this study, face validity, content validity, construct validity, and concurrent validity were measured, and Construct Reliability (CR) and Cronbach’s alpha were used to measure reliability. Results: The impact score of the translated IU-12 indicated acceptable face validity (value of impact score was greater than 1.5). The value of Content Validity Index (CVI) and the value of Content Validity Ratio (CVR) were above 0.7 and 0.78, respectively. The values of CVI and CVR indicated the items had acceptable content validity and were deemed essential to the measure. The measurement model analysis showed the measure with two subscales had good fit indices (CMIN/df = 2.75, p < 0.01, RMSEA = 0.07, TLI ...
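The two content-validity indices named in this abstract can be illustrated with a short sketch. It assumes the commonly used definitions, which the abstract does not spell out: Lawshe's Content Validity Ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating an item "essential" out of N experts, and an item-level CVI taken as the share of experts rating the item 3 or 4 on a 4-point relevance scale. The function names and panel data are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR for one item: ranges from -1 (none) to +1 (all experts)."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_content_validity_index(ratings, relevant_from=3):
    """Item-level CVI: share of experts rating the item as relevant.

    ratings: one relevance rating per expert on an assumed 4-point scale,
    where ratings >= relevant_from count as "relevant".
    """
    return sum(rating >= relevant_from for rating in ratings) / len(ratings)


# Hypothetical panel: 9 of 10 experts judge the item essential.
print(content_validity_ratio(9, 10))
# Hypothetical relevance ratings from 5 experts for the same item.
print(item_content_validity_index([4, 4, 3, 2, 4]))
```

Acceptable cut-offs for both indices depend on panel size, so thresholds such as those quoted in the abstract should be read against the number of experts consulted.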
Measuring Violence Against Children: A COSMIN Systematic Review of the Psychometric and Administrative Properties of Adult Retrospective Self-report Instruments on Child Abuse and Neglect - Bridget Steele, Lakshmi Neelakantan, Janina Jochim, Lynn M. Davies, Mark Boyes, Hannabeth Franchino-Olsen, Michael Dunne, Franziska Meinck, 2023
Valid, meaningful, and reliable adult retrospective measures of violence against children (VAC) are essential for establishing the prevalence, risk factors, and...
Systematic Review of the Psychometric Performance of Generic Childhood Multi-attribute Utility Instruments
Applied Health Economics and Health Policy - Childhood multi-attribute utility instruments (MAUIs) can be used to measure health utilities in children (aged ≤ 18 years) for economic...
Comparison of content and psychometric properties for assessment tools used for brain tumor patients: a scoping review - Health and Quality of Life Outcomes
Aims To determine the most frequently utilized functional status assessment instruments for patients with brain tumors, compare their contents, using the International Classification of Functioning, Disability and Health (ICF), and their psychometric properties. Methods A scoping review was conducted to explore possible assessment instruments and summarize the evidence. A systematic literature search was performed to identify the functional assessment tools frequently used in clinical trials in the PubMed, ScienceDirect, and ProQuest databases. The content of the most used instruments was linked to the ICF categories. The psychometric qualities of these assessment tools were systematically searched and analyzed. Results The nine most used assessment tools in clinical trials were identified. The most frequently used assessment instrument is the Karnofsky Performance Scale, which was developed for a general assessment of oncological patients. Out of four self-assessment tools, two were disease-specific (EORTC QLQ-BN20 and FACT-Br). The EORTC QLQ-C30 has shown good psychometric properties in patients with brain tumors as well as in patients with various oncological diseases; similarly, the SF-36 is used in patients with brain tumors as well as in patients with various diseases. The Functional Independence Measure and the Barthel Index were two objective assessment tools that described functioning, and two were neuropsychological tests (MMSE and Trail Making Test). Two hundred eighty-three meaningful concepts were identified and linked to the 102 most relevant second-level categories covering all components of the ICF. Forty-nine studies reporting psychometric properties of those nine assessment tools were identified, indicating good reliability and validity for all the instruments. Conclusion The nine most frequently utilized functional status assessment instruments for patients with brain tumors represent all components of the ICF and have good psychometric properties.
However, the choice of the tool depends on the clinical question posed and the aim of its use.
Development and psychometric properties of the Digital Difficulties Scale (DDS): An instrument to measure who is disadvantaged to fulfill basic needs by experiencing difficulties in using a smartphone or computer
Today, some individuals may be at a disadvantage by experiencing difficulties in using a smartphone or computer to reach specific outcomes (e.g., looking for a job, searching for information on insurances) or in general (e.g., not knowing how to change the settings of an app or website). The aim of this study is to develop and examine the psychometric properties of a new instrument, called the Digital Difficulties Scale (DDS). A multi-phase method was performed to develop the questionnaire in the period from January 2019 to November 2019. The item pool was generated based on a literature review, informal observations and interviews. Then, this item pool was presented both to experts (n = 6) and non-experts (n = 492) to assess content and face validity. In a second stage, construct validity (both exploratory and confirmatory), convergent and divergent validity, internal consistency, and test-retest reliability of the questionnaire were tested. These analyses were based on a representative sample (n = 1000), and an independent sample for test-retest reliability (n = 44). Twenty-four items were generated and refined during content and face validity assessment. The exploratory factor analysis revealed three factors (Specific Digital Difficulties, General Digital Difficulties, and Worries about Future Digital Difficulties) containing sixteen items, together explaining 73.03% of the observed variance. The confirmatory factor analysis indicated adequate model fit. Both convergent and divergent validity were good, and internal consistency was excellent, with Cronbach’s alphas ranging between .93 and .97. Finally, our instrument demonstrated good test-retest reliability, with intraclass correlation coefficients between .73 and .86. Consequently, the DDS can be used both in future research and practice, as it is a valid and reliable instrument to measure who is disadvantaged to fulfill basic needs by experiencing difficulties in using a smartphone or computer.
Quality of Life Research - A well-defined and reliable patient-reported outcome instrument for COVID-19 is important for assessing symptom severity and supporting research studies. The InFLUenza...
What Do You Think You Are Measuring? A Mixed-Methods Procedure for Assessing the Content Validity of Test Items and Theory-Based Scaling
The valid measurement of latent constructs is crucial for psychological research. Here, we present a mixed-methods procedure for improving the precision of construct definitions, determining the content validity of items, evaluating the representativeness of items for the target construct, generating test items, and analyzing items on a theoretical basis. To illustrate the mixed-methods content-scaling-structure (CSS) procedure, we analyze the Adult Self-Transcendence Inventory, a self-report measure of wisdom (ASTI, Levenson et al., 2005). A content-validity analysis of the ASTI items was used as the basis of psychometric analyses using multidimensional item response models (N = 1215). We found that the new procedure produced important suggestions concerning five subdimensions of the ASTI that were not identifiable using exploratory methods. The study shows that the application of the suggested procedure leads to a deeper understanding of latent constructs. It also demonstrates the advantages of theory-based item analysis.
Empathy: Assessment Instruments and Psychometric Quality – A Systematic Literature Review With a Meta-Analysis of the Past Ten Years
Objective: To verify the psychometric qualities and adequacy of the instruments available in the literature from 2009 to 2019 to assess empathy in the general population. Methods: The following databases were searched: PubMed, PsycInfo, Web of Science, Scielo, and LILACS using the keywords “empathy” AND “valid*” OR “reliability” OR “psychometr*.” A qualitative synthesis was performed with the findings, and meta-analytic measures were used for reliability and convergent validity. Results: Fifty studies were assessed, which comprised 23 assessment instruments. Of these, 13 proposed new instruments, 18 investigated the psychometric properties of instruments previously developed, and 19 reported cross-cultural adaptations. The Empathy Quotient, Interpersonal Reactivity Index, and Questionnaire of Cognitive and Affective Empathy were the instruments most frequently addressed. They presented good meta-analytic indicators of internal consistency [reliability, generalization meta-analyses (Cronbach’s alpha): 0.61 to 0.86], but weak evidence of validity [weak structural validity; low to moderate convergent validity (0.27 to 0.45)]. Few studies analyzed standardization, prediction, or responsiveness for the new and old instruments. The new instruments proposed few innovations, and their psychometric properties did not improve. In general, cross-cultural studies reported adequate adaptation processes and equivalent psychometric indicators, though there was a lack of studies addressing ...
Psychometric properties of self-reported measures of health-related quality of life in people living with HIV: a systematic review - Health and Quality of Life Outcomes
Objective To identify and assess the psychometric properties of patient-reported outcome measures (PROMs) of health-related quality of life (HRQoL) in people living with HIV (PLWH). Methods Nine databases were searched from January 1996 to October 2020. Methodological quality was assessed by using the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) Risk of Bias Checklist. We used the COSMIN criteria to summarize and rate the psychometric properties of each PROM. A modified Grading, Recommendations, Assessment, Development, and Evaluation (GRADE) system was used to assess the certainty of evidence. Results Sixty-nine studies reported on the psychometric properties of 30 identified instruments. All studies were considered to have adequate methodological quality in terms of content validity, construct validity, and internal consistency. Limited information was retrieved on cross-cultural validity, criterion validity, reliability, hypothesis testing, and responsiveness. High-quality evidence on psychometric properties was provided for the Medical Outcomes Study HIV Health Survey (MOS-HIV), the brief version of the World Health Organization's Quality of Life Instrument in HIV Infection (WHOQoL-HIV-BREF), 36-Item Short Form Survey (SF-36), Multidimensional Quality of Life Questionnaire for Persons with HIV/AIDS (MQoL-HIV), and WHOQoL-HIV. Conclusions The findings from the included studies highlighted that among HIV-specific and generic HRQoL PROMs, MOS-HIV, WHOQoL-HIV-BREF, SF-36, MQoL-HIV, and WHOQoL-HIV are strongly recommended to evaluate HRQoL in PLWH in research and clinics based on the specific aims of assessments and the response burden for participants.