    Effects of Some Design Factors on the Distribution of Similarity Indices in Cluster Analysis

    Date
    2015
    Author
    Albatineh, A.
    Khan, H. M.
    Zogheib, Bashar
    Type
    Journal Article
    Peer-Reviewed
    Abstract
    This article investigates the effects of the number of clusters, cluster size, and correction for chance agreement on the distribution of two similarity indices, namely the Jaccard and Rand indices. Skewness and kurtosis are calculated for the two indices and their corrected forms, then compared with those of the normal distribution. Three clustering algorithms are implemented: complete linkage, Ward, and K-means. Data were randomly generated from bivariate normal distributions with specified means and variance-covariance matrices. Three-way ANOVA is performed to assess the significance of the design factors, using skewness and kurtosis of the indices as responses. Test statistics for skewness and kurtosis and the observed power are calculated. Simulation results showed that, independent of the clustering algorithm or similarity index used, the cluster size × number of clusters interaction and the main effects of cluster size and number of clusters were always significant for skewness and kurtosis. The three-way interaction of cluster size × correction × number of clusters was significant for skewness of Rand and Jaccard indices using all clustering algorithms, but was not significant using Ward's method for both Rand and Jaccard indices, while significant for Jaccard only using complete linkage and K-means algorithms. The correction for chance agreement was significant for skewness and kurtosis using Rand and Jaccard indices when the complete linkage method is used. Hence, such design factors must be taken into consideration when studying the distribution of such indices.
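The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the chance-corrected form shown is the commonly used pair-count expression for the adjusted Rand index (the article's own correction may be defined differently), and skewness and kurtosis use simple population moments. A corrected Jaccard index is omitted because its form is not given here.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Count point pairs by whether each partition puts them together.

    Returns (a, b, c, d): together in both, together in A only,
    together in B only, separated in both.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same points")
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            a += 1
        elif same_a:
            b += 1
        elif same_b:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(labels_a, labels_b):
    """Fraction of pairs on which the two partitions agree."""
    a, b, c, d = pair_counts(labels_a, labels_b)
    return (a + d) / (a + b + c + d)


def jaccard_index(labels_a, labels_b):
    """Like Rand, but ignores pairs separated in both partitions."""
    a, b, c, _ = pair_counts(labels_a, labels_b)
    total = a + b + c
    # Assumption: treat two all-singleton partitions as identical.
    return a / total if total else 1.0


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected Rand index written in terms of pair counts."""
    a, b, c, d = pair_counts(labels_a, labels_b)
    denom = (a + b) * (b + d) + (a + c) * (c + d)
    # Assumption: degenerate (trivial) partitions count as full agreement.
    return 2.0 * (a * d - b * c) / denom if denom else 1.0


def _central_moment(values, order):
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Population skewness m3 / m2**1.5 (0 for a normal distribution)."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return _central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Population kurtosis m4 / m2**2 minus 3 (0 for a normal distribution)."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return _central_moment(values, 4) / m2 ** 2 - 3.0
```

In a simulation of the kind the abstract describes, one would compute an index for each replicate (true labels versus a clustering result), collect the values in a list, and pass that list to `skewness` and `excess_kurtosis` to compare against the normal reference value of 0.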
    URI
    http://hdl.handle.net/11675/883
    External link
    http://www.tandfonline.com/doi/abs/10.1080/03610918.2015.1082586?journalCode=lssp20
    Collections
    • College of Engineering and Applied Sciences [148]