***************************
Relational Class Analysis for R
***************************

Amir Goldberg
amirgo [at] stanford.edu

(C) Copyright 2012, Amir Goldberg

This software was developed with Jinjian Zhai, jinjian.zhai@alumni.stanford.edu.
Portions of this code were developed with assistance of Ryan Turner and Gábor Csárdi.

This file is part of RCA-R. 

RCA-R is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

RCA-R is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA


------------------------------------------------------------------------

This is an R package using the C implementation of Relational Class Analysis (RCA). 

RCA is a method for detecting heterogeneity in attitudinal data. See Goldberg, A., AJS 
116(5): 1397-1436 (http://www.jstor.org/stable/10.1086/657976) for
more details about the algorithm and its applications. 


Let X be a dataset of size N x K. X must not include missing data. 
RCA finds an optimal division of X into G groups, such that each group 
of observations follows a distinctive pattern of relationships between the
K variables. Group assignment is reported as a vector M. Modularity, Q,
measures the strength of the division into groups. 
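
Because X must be a complete N x K dataset, it can help to check the data before passing it to RCA. The R sketch below is not part of RCA-R: check_rca_input is a hypothetical helper name, and the conversion with as.matrix and the numeric-only check are our assumptions about what a sensible input looks like.

```r
# Hypothetical helper (not part of RCA-R): validate a dataset before running RCA.
check_rca_input <- function(X) {
  X <- as.matrix(X)  # accept data frames as well as matrices (assumption)
  if (!is.numeric(X)) stop("X must contain only numeric values")  # assumption
  if (anyNA(X)) stop("X must not include missing data")
  X
}

# Toy data: N = 4 observations, K = 3 variables.
X <- check_rca_input(data.frame(a = c(1, 2, 3, 4),
                                b = c(4, 3, 2, 1),
                                c = c(2, 2, 5, 5)))
dim(X)  # N and K
```

Rows with missing values would have to be dropped or imputed before the call; which of the two is appropriate depends on your data.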

RCA-R is a C-based package that implements the RCA algorithm and is interfaced to R through a simple function. 
You will need to follow the simple steps below in order to use RCA through R. 

------------------------------------------------------------------------


TABLE OF CONTENTS


A. INSTALLING RCA-R

B. RUNNING RCA-R

C. OUTPUT



------------------------------------------------------------------------

A. INSTALLING RCA-R

In order to install RCA-R you will need to (1) install prerequisite software and (2) generate the R package locally. 

A. Prerequisite software:
(1) To compile RCA-C you will need to install a C/C++ compiler. We recommend
the GNU gcc compiler (http://gcc.gnu.org/). Many OS distributions include this compiler. Mac users can install the gcc compiler by downloading the Xcode app from the Apple App Store (for free). 
(2) Install igraph either by typing install.packages("igraph") in R, or by downloading it directly from http://igraph.sourceforge.net/.

B. Generating the RCA package:
(1) Open a shell (by launching terminal on Mac/Linux or by launching cmd on Windows). 
(2) Change directory to the location of the RCA-R files. 
(3) Type "./build.sh" in a shell. 
If you are using a compiler other than GCC, change build.sh accordingly (replace "gcc" with the compiler you are using). 
If you are using GCC, make sure that the following environment variables are defined correctly:
-a- C_INCLUDE_PATH: points to the location of igraph.h
-b- LIBRARY_PATH: points to the location of igraph.so
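
One way to verify the two environment variables listed above is to check them from R before running the build. This is only a sketch: dirs_containing is a hypothetical helper, not part of RCA-R, and it assumes the variables hold one or more directories separated by the platform's path separator. The library file name follows this README; on many systems the file is called libigraph.so, so adjust the name as needed.

```r
# Hypothetical helper (not part of RCA-R): return the directories listed in
# an environment variable that contain the given file.
dirs_containing <- function(var, filename) {
  value <- Sys.getenv(var, unset = "")
  if (!nzchar(value)) return(character(0))  # variable is not set
  dirs <- strsplit(value, .Platform$path.sep, fixed = TRUE)[[1]]
  dirs <- dirs[nzchar(dirs)]                # drop empty entries
  dirs[file.exists(file.path(dirs, filename))]
}

# An empty result means the variable is unset or the file was not found.
dirs_containing("C_INCLUDE_PATH", "igraph.h")
dirs_containing("LIBRARY_PATH", "igraph.so")
```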

------------------------------------------------------------------------

B. RUNNING RCA-R

1. In R, change the current directory to the location of RCA-R and type:
   source("RCA.R")
2. Run RCA as follows:
   result <- RCA(dataset,bootstrap,p-value)
   Where:
   - dataset: your dataset (a matrix of size N x K, where each row is one observation and each column a variable)
   - bootstrap: an integer greater than or equal to 0. Represents the number of bootstrap samples used to determine statistical significance. Default is 1000. A value of 0 means that the mean and standard deviation will be computed based on the observed distribution without bootstrapping. Using fewer than 100 bootstraps is strongly discouraged, as it may bias your results. 
   - p-value: the threshold for determining statistical significance. Default is 0.05 (significance at the 95% level). 
  
3. Example: 
   source("RCA.R")
   dataset <- read.csv("demo.csv", header=FALSE, sep=",")
   res <- RCA(dataset, 1000, 0.05)

------------------------------------------------------------------------

C. OUTPUT

Results include:

a. [your_returned_data]$member: Assignment vector (size 1xN)
b. [your_returned_data]$mod: Modularity (double)
c. [your_returned_data]$merge: Merge tree for the partitioning procedure (array of varying size)
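
The fields above can be inspected with ordinary R tools. The sketch below uses a made-up stand-in list instead of a real RCA result, so the values are purely illustrative; only the field names member and mod come from this README. The merge field is left out because its layout is not described here.

```r
# Stand-in for a value returned by RCA(); the numbers are made up.
res <- list(member = c(1, 1, 2, 3, 2, 1), mod = 0.42)

group_sizes <- table(res$member)  # number of observations in each group
n_groups <- length(group_sizes)   # G, the number of groups found

print(group_sizes)
cat("Groups:", n_groups, " Modularity:", res$mod, "\n")
```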
