

1

How to analyze new variables in CoSSciB Galaxy (XSEDE): student guide

Complex Social Science and BioGeography

Computing variables, making interaction items, and merging items within a categorical variable to make a dummy (dichotomous) variable. With these, students can detect more complicated relationships among independent variables.

In development: Bayesian Causal Networks

Ren Feng: Sociology course material at Univ. Xiamen, expanded by Douglas R. White. Presenters: Eric Baum, Stuart Martin, Argonne Labs. R code: Anthon Eff, Paul Rodriguez.

2

Dependent variable v2018: Private Property & Punishment for Theft

Layout of one of 50 sociology student projects at Xiamen University, 2014, using CoSSci Gateway UC Irvine VM and UCSD Comet HPC, dataset SCCS

3

Entering Dependent, Independent and other variables in CoSSci Galaxy

DpV v2018: Property: Importance of Private Ownership and Severity of Punishment for Theft
v17: Money (Media of Exchange) and Credit
v155: Scale 7- Money
v234: Settlement Patterns
v235: Mean Size of Local Communities
v710: Social Stratification in the Local Community
v727: Importance of Agriculture in Subsistence, including Gardening
v773: (No) Internal Warfare (between Communities of Same Society)
v95: Political Power- Third Most Important Source

4

Entering Dependent, Independent and other variables in CoSSci Galaxy

Images here and below use Screen Shot, then Insert, then Photo, then Picture from file.

5

Click the diskette icon at the lower left for CSV results.

Then find “Rmodel” in your *.csv file and prepare to delete the nonpredictive variables v234 and v727. Also check the “To Try” entry in the .csv, which tells you that you can test v95 and add it to your model if it is significant.

Your new model has v155, v17, v235, v710, and v773, but not v234, v727, or v95. Now change your model. Then press and
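For students who prefer to search the saved file in R rather than in a spreadsheet, here is a minimal sketch. The file contents below are hypothetical stand-ins written to a temporary file; the real CoSSci *.csv layout is not shown in these slides and may differ, so the file is read as plain text and searched line by line.

```r
# Hypothetical stand-in for a downloaded CoSSci *.csv file.
# Column names and numbers are made up purely for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "section,variable,coef,pvalue",
  "Rmodel,v155,0.21,0.01",
  "Rmodel,v234,0.02,0.71",
  "To Try,v95,NA,0.04"
), csv_path)

# Read the file as plain text so the search does not depend on the column layout.
csv_lines <- readLines(csv_path)

# Positions and contents of every line that mentions "Rmodel".
rmodel_idx <- grep("Rmodel", csv_lines, fixed = TRUE)
rmodel_lines <- csv_lines[rmodel_idx]
print(rmodel_lines)

# The same search for the "To Try" suggestions.
totry_lines <- grep("To Try", csv_lines, fixed = TRUE, value = TRUE)
print(totry_lines)
```

With a real download, replace `csv_path` with the path to your own saved file.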

6

Your new model, when executed, happens to show that all these variables are significant, but two of them are opposite measures of money (curvilinear). Your instructor can run a crosstab in R: table(dx$v17, dx$v155). Now you’ll learn two tricks. If you make v155.d5 a dummy variable, you’ll get the dichotomy 1234/5. For v17, however, you want to contrast 123/45 as a New Variable, not a dummy. View the codebook for v17 versus v155 at eclectic.ss.uci.edu/~drwhite/courses/SCCCodes.htm to see why this makes sense.
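The crosstab and the two recodes can be sketched in plain R as follows. This is only an illustration outside Galaxy: the data frame dx here is a small made-up stand-in for the SCCS data, and the name v17ge4 for the 123/45 contrast is borrowed from a later slide.

```r
# Toy stand-in for the SCCS data frame dx; values are made up for illustration.
dx <- data.frame(
  v17  = c(1, 2, 3, 4, 5, NA, 4, 1),
  v155 = c(1, 2, 5, 4, 5, 3, NA, 5)
)

# Crosstab of the two money measures, keeping missing values visible.
print(table(dx$v17, dx$v155, useNA = "ifany"))

# Dummy variable: v155 equal to 5 versus 1-4 (the 1234/5 dichotomy).
# Missing codes stay missing.
dx$v155.d5 <- as.integer(dx$v155 == 5)

# New variable: v17 of 4 or 5 versus 1-3 (the 123/45 contrast).
dx$v17ge4 <- as.integer(dx$v17 >= 4)

print(dx)
```

The dummy picks out a single category, while the new variable cuts the scale at a threshold; that is the difference between the two tricks.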

These are two important operations to learn. But we are no closer to understanding causes of private property in the Standard Sample, probably because only 79 cases are coded. Here is what I mean: there may be a contradiction in the results:

7

To see the problem with this topic, D.R.White states the hypothesis that these data represent colonialism rather than evolution of private property: where the currency is completely foreign (i.e., code 4: colonial), private property is less likely, and we may be looking at smaller colonized communities in the Standard Cross-Cultural Sample.

8

In this model we test whether v17.d4 (“Foreign Currency”, dichotomized as value 4 versus 1, 2, 3, 5) tends to remove local private property, along with a negative v235 term, that is, a tendency of this removal to affect smaller communities (i.e., effects of colonialism). But bear in mind that with 79 of 186 cases coded, it’s a small subsample.
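To make the shape of this test concrete, here is a rough sketch in R. Ordinary least squares (lm) is swapped in for simplicity and is not claimed to be the estimation that CoSSci runs; the data are random numbers with arbitrary code ranges, so the printed coefficients mean nothing and only the mechanics are of interest.

```r
set.seed(1)
n <- 79  # number of coded cases mentioned in the slides

# Random stand-in data; the code ranges are arbitrary assumptions.
dx <- data.frame(
  v17   = sample(1:5, n, replace = TRUE),
  v235  = sample(1:8, n, replace = TRUE),
  v2018 = sample(1:4, n, replace = TRUE)
)

# Dummy for foreign currency: code 4 versus 1, 2, 3, 5.
dx$v17.d4 <- as.integer(dx$v17 == 4)

# Plain OLS stand-in: dependent variable on the dummy and community size.
fit <- lm(v2018 ~ v17.d4 + v235, data = dx)

# In real data, the hypothesis would predict negative signs for both terms.
print(summary(fit)$coefficients)
```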

9

computing variables

The computed variables tried above were

how to define new variables in Galaxy CoSSci

10

making interaction items
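As a minimal sketch of what an interaction item is, the R lines below multiply two variables. The choice of v17ge4 and v235 is only an assumption for illustration, the data are made up, and the name v17ge4_x_v235 is hypothetical.

```r
# Made-up values for two of the variables used in this guide.
dx <- data.frame(
  v17  = c(1, 4, 5, 2, 4),
  v235 = c(2, 3, 6, 7, 1)
)

# Threshold variable: v17 of 4 or 5 versus 1-3.
dx$v17ge4 <- as.integer(dx$v17 >= 4)

# Interaction item: equals v235 where v17 is 4 or 5, and 0 elsewhere.
dx$v17ge4_x_v235 <- dx$v17ge4 * dx$v235

print(dx)
```

In an R model formula, the same product term can be written as v17ge4:v235 instead of computing the column by hand.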

11

merging items within a categorical variable to make a dummy variable

v17.d4 dichotomy as in
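Outside the Galaxy form, the same kind of merge can be sketched in R. The helper make_dummy and the name v17.m45 are hypothetical, introduced only for this example, and the values are made up.

```r
# Hypothetical helper: 1 if x is one of `levels_one`, 0 otherwise; NA stays NA.
make_dummy <- function(x, levels_one) {
  ifelse(is.na(x), NA_integer_, as.integer(x %in% levels_one))
}

# Made-up codes for v17, including one missing value.
v17 <- c(1, 2, 3, 4, 5, NA)

# Single category against the rest: 4 versus 1, 2, 3, 5.
v17.d4 <- make_dummy(v17, 4)

# Merged categories: 4 and 5 together versus 1, 2, 3.
v17.m45 <- make_dummy(v17, c(4, 5))

print(data.frame(v17, v17.d4, v17.m45))
```

Listing the categories explicitly, rather than using a threshold, also works for categorical variables whose codes have no natural order.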

12

We may then have figured out the curvilinearity between v155.d5 and v17ge4 in the original model: one reflects the evolution of property, the other the colonial suppression of indigenous property given foreign currency. This might make sense given that Hupka and Ryan, the coders of v2018, focused on jealousy: The Cultural Contribution To Jealousy (1990), Cross-Cultural Research 24(1-4): 51-71.

13

Private Property & Punishment for Theft (v2018) in the context of coding for The Cultural Contribution To Jealousy (N=78 of 186 societies)

14

Recap of how the final model was defined in CoSSci by the student (omitting the potential “To Try” variable in the *.csv output). Clicking the square, round (i), and green buttons at the bottom of output 10 DEf01f gives, respectively, the saved *.csv output, diagnostics for correcting errors in the model, and a return to the panel of variables. Clicking the orange tab in the upper right light blue box saves the steps in the output as online sharable/publishable model histories. The main page of the CoSSci Gateway has a 2-minute YouTube explanation of the Gateway and a 20-minute YouTube video on the Complex Social Science Gateway by Lukasz Lacinski.

15

16

An upcoming CoSSci development

CoSSci (Complex Social Science) users will soon be able to start with Bayesian presuppositions or inferences about possible predictors of a dependent variable from one of many cross-cultural datasets on hunter-gatherers or other worldwide or regional datasets. Iterative improvements to initial models benefit from selecting “to try” variables that are possibly predictive. Diagnostics with group significance tests alert the user to opportunities in testing for Bayesian causality, which often occur after a dozen or more improvements of the model. Of models tested, about 65% result in models that pass these tests. For each of these, using HPC iterations with imputation of missing data, subsets of variables will be tested for networks of variables identified by library(bnlearn) to show Bayesian causalities.

These build on steps in developing the CoSSci Gateway for analysis of ethno-archaeological, historical empires, and world bioecological data, reviewed in our talk (wirelessly open to all for free) by Paul Rodriguez, Eric Blau, Lukasz Lacinski, Stu Martin, Rachana Ananthakrishnan, Tom Uram, Tolga Oztan, and Doug White, Sept 30-Oct 1 2015, for the Complex Systems Digital Campus (CS-DC) sponsored by UNESCO and its Complex Systems Society (CSS) as an ECCS component (European Conferences on Complex Systems: http://www.ccs2015.org) and as part of the Arizona State Tempe Conference on Complex Systems, co-hosted by the Santa Fe Institute.
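For readers curious about what library(bnlearn) does, here is a small sketch of structure learning. It uses learning.test, a toy dataset shipped with bnlearn, not SCCS data, and hill-climbing (hc) is just one of the package's learners; nothing here is claimed to match the planned CoSSci workflow. The sketch is skipped if bnlearn is not installed.

```r
if (requireNamespace("bnlearn", quietly = TRUE)) {
  # Toy dataset of six discrete variables (A-F) that ships with bnlearn.
  data("learning.test", package = "bnlearn")

  # Learn a network structure by score-based hill-climbing.
  dag <- bnlearn::hc(learning.test)

  # Two-column matrix ("from", "to") listing the directed arcs that were found.
  found_arcs <- bnlearn::arcs(dag)
  print(found_arcs)
} else {
  found_arcs <- NULL
  message("bnlearn is not installed; skipping the sketch.")
}
```

A learned arc is a statistical dependence under the method's assumptions; reading it as causal requires further justification.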