<h2 align="center"> OpCluster: Automatic Extraction and Hierarchical Clustering of Fine-Grained Opinions </h2>
<p align="justify"> OpCluster-PT is a customized version of OpCluster for the Portuguese language. </p>
<p align="justify"> OpCluster is an algorithm for extracting and hierarchically clustering implicit and explicit fine-grained opinions (also called aspects) from web consumer reviews. The method organizes similar implicit and explicit aspects, taking their context of use into account, inside a tree. For example, in the review "she considers the price of the camera very expensive", the consumer uses the term "price" to evaluate an aspect (property) of the camera. However, consumers may also use the terms "cost", "value", "investment", "cost-benefit", etc. In addition, consumers may use implicit or explicit expressions to refer to the same aspect, e.g., "she got calls at the São Francisco river" and "working anywhere" were used in smartphone product reviews to implicitly evaluate the aspect "signal". It is also interesting to note that, in a wide range of domains, proper names may also be used to refer to aspects. For instance, the proper names "Sony" and "Nikon" may be used to evaluate the "product brand" aspect of digital cameras. Hence, this task is hard! </p>
<h2 align="left"> HOW DO YOU USE OPCLUSTER? </h2>

- Download the repository folder from Git;
- Open the file "OpClusterPT.py" (any IDE will do; Python 2 or 3 must be installed);
- Check that all the input files are in the same folder as the "OpClusterPT.py" file;
- Extract the archives "OntoPT.tar.xz" and "corp_xml_reli.zip";
- Run the algorithm.
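
The file check and extraction steps above can also be scripted. The following is a minimal sketch, not part of OpCluster itself: the helper names `missing_inputs` and `extract_archives` are hypothetical, only the three file names come from the steps above, and it assumes Python 3, because the standard `tarfile` module cannot read `.tar.xz` archives in Python 2.

```python
import tarfile
import zipfile
from pathlib import Path

# File names taken from the steps above; everything else is illustrative.
SCRIPT = "OpClusterPT.py"
ARCHIVES = ("OntoPT.tar.xz", "corp_xml_reli.zip")


def missing_inputs(folder):
    """Return the expected files that are absent from `folder`."""
    folder = Path(folder)
    expected = (SCRIPT,) + ARCHIVES
    return [name for name in expected if not (folder / name).is_file()]


def extract_archives(folder):
    """Extract each archive into `folder`; return the archive names handled."""
    folder = Path(folder)
    handled = []
    for name in ARCHIVES:
        path = folder / name
        if name.endswith(".zip"):
            with zipfile.ZipFile(path) as archive:
                archive.extractall(folder)
        else:
            # "r:xz" opens an xz-compressed tar archive (Python 3 only).
            with tarfile.open(path, "r:xz") as archive:
                archive.extractall(folder)
        handled.append(name)
    return handled


if __name__ == "__main__":
    # Assumes the script is started from the downloaded repository folder.
    here = Path.cwd()
    missing = missing_inputs(here)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        extract_archives(here)
        print("Archives extracted; now run", SCRIPT)
```

Saved under any name in the downloaded folder and run from there, the script either lists the missing files or extracts both archives in place, after which "OpClusterPT.py" can be run as described.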