6dfb - Tumblr blog

6dfb · 8 years ago


CFParticipation: Re-Designing Bacon, Friday, November 17th

The Six Degrees of Francis Bacon project, in association with Carnegie Mellon University and the Folger Shakespeare Library, invites early modernists, digital humanists, designers, and students to participate in Re-Designing Bacon, a half-day, multi-site participatory event dedicated to exploring and testing a major re-design of the Six Degrees of Francis Bacon website. Taking place both online and in person on Friday, November 17th, the event will begin at 12:00 noon EST. Participants in this meetup and add-a-thon will learn how to explore the new, re-designed network and to add people, groups, and relationships to SixDegreesofFrancisBacon.com before joining with other participants to test features, log bugs, and help enrich the collaborative picture of the social networks of early modern Britain.

Goals

The Six Degrees of Francis Bacon project is dedicated primarily to the social networks of early modern Britain, 1500-1700, but in order to support scholars and students of historical social networks more broadly, the project team, with critical support from the National Endowment for the Humanities, will soon release freely available website code on GitHub under an open source license for modification and reuse. In preparation for that release, the Six Degrees project aims to highlight the major features of its year-long redesign, developed with Density Design Lab from Politecnico di Milano, and to test its functionalities. The intended outcomes are threefold: 1.) presentation of significant features of the Six Degrees of Francis Bacon redesign; 2.) medium-volume user testing; 3.) a richer picture of the networks and groups of early modern Britain.

Rationale

Recent years have seen growing interest in the intersection of digital humanities and design and in the reconstruction of historical social networks. This event builds on and seeks to contribute to both of these concerns. While the new Six Degrees of Francis Bacon interface is designed specifically for researchers of early modern Britain, it also confronts many of the challenges that humanists in general now face in the contexts of data visualization, crowdsourcing, user experience, and graphic design. For instance, how can visualizations represent the uncertainty, vagueness, and debate that are central to humanities inquiry? How can designers use visualizations to coordinate quantitative measures and human judgements? What are principled ways of reducing the complexity of social relations to visual form? How can a web interface be part of an ongoing scholarly knowledge project? And how might qualitative categories such as identity and group belonging be presented alongside quantitative measures? Engaging with such questions at a practical level, Re-Designing Bacon will be a chance for designers and digital humanists to explore and make suggestions for the soon-to-be-released website code and for early modernists to continue building a richer picture of Britain’s early modern social network.

Summary

What: Re-Designing Bacon, a meetup and add-a-thon

Who: anyone with an interest in early modern studies, digital humanities, design, interactive network visualization, history and literature, and/or historical social networks

Where: Carnegie Mellon University, Hunt Library, Room 106B; Folger Shakespeare Library Foulke Conference Room; Twitter via #ReDesigningBacon and redesigningbacon.slack.com

When: November 17th, noon - 4 PM EST

What to Bring: a laptop, goodwill, enthusiasm; optional: biographies or other scholarly books

Additional Notes:

Anyone with a confirmed email address from @cmu.edu, @folger.edu, @georgetown.edu will be able to create an account on redesigningbacon.slack.com. Others can email [email protected] to join.

For those wishing to organize local satellite events or to bring whole classes to one of the meetups, send an internet domain (e.g. @georgetown.edu) to [email protected] and we’ll be sure anyone with a confirmed email address from that domain will be able to join the Slack group.

Do you already have lists, spreadsheets, or network diagrams you’d like to see incorporated into Six Degrees? Feel free to bring them along.

Screenshots

6dfb · 8 years ago

Six Degrees of Pennsylvania: William Penn and Quaker Brokerage

By John Ladd

This week our tutorial, Exploring and Analyzing Network Data with Python, was published on Programming Historian. If you're thinking about getting started with networks or want to learn more about network metrics, we hope you'll check it out---it'll be especially helpful for learning what you can find out about a network without visualizing it.

As part of that tutorial, we published a Quaker dataset of nodes and edges, a subset of Six Degrees data, that can be used to better understand the social relationships of Quakers in the 17th century. As an extension of the insights we gain in the tutorial, I wanted to include two more important observations about Quakers that can only be discovered by examining the full Six Degrees network.

To get us started, here's a visualization of the full Six Degrees network (at greater than 60% confidence):

The first thing to notice is that this graph shows more structure than a typical large network "hairball": there's a lot of interconnection, but not so much that the visualization is useless. To make subgroups more visible, I've colored the graph using Louvain modularity, a method of community detection in networks. You can see that the nodes separate into reasonably sized groups, usually around the large monarch nodes (nodes are sized by degree, the number of connections). The smaller groups at the margins of the network are a little more interesting than the big ones that center around a king or queen and his/her court. If we zoom in on the light blue cluster at the top right, about 2 o'clock, we'll find a group that looks familiar:

The labels are hard to read at this scale, but this light blue cluster is the Society of Friends, the Quakers. The modularity algorithm picked out the Quakers without any external prompting or labeling. This brings us to our first observation.

Observation 1: Quakers make up a distinct social group within our dataset, identifiable via network measures.

Why should this matter? It could have been possible that Quakerism was more a matter of personal religious expression than social bonds and that Quakerism as such exerted no more influence on the larger network structure than, say, vegetarianism or hailing from Coventry. Yet this community detection finding shows the opposite---the group is a cohesive category detectable through shared social ties.

Of course, since the bulk of Six Degrees data is derived from the Oxford Dictionary of National Biography, this cluster is as much an artifact of historiography as of lived social relations (Quaker biographies being a more prominent strain in historiography than Coventry or vegetarian lives), but our statistical method is designed to extract evidence of social relationships from the history of scholarship.
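For readers curious about what a modularity-based method is optimizing, here is a minimal sketch in plain Python. It does not run the Louvain algorithm itself; it only computes the modularity score Q that Louvain tries to maximize, for a partition supplied by hand. The toy edge list and the node names `a` through `f` are invented for illustration and are not Six Degrees data.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of an undirected, unweighted graph for a given partition.

    Q = sum over communities c of  e_c / m - (d_c / 2m)^2
    where m is the number of edges, e_c the number of edges inside c,
    and d_c the summed degree of the nodes in c.
    """
    m = len(edges)
    internal_edges = defaultdict(int)  # e_c
    degree_sum = defaultdict(int)      # d_c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by a single edge (c-d).
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]

# Partition 1: each triangle is its own community.
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
# Partition 2: everything lumped into one community.
one_group = {node: 0 for node in two_groups}

q_two_groups = modularity(toy_edges, two_groups)  # 5/14, about 0.357
q_one_group = modularity(toy_edges, one_group)    # 0.0
```

The partition that follows the two triangles scores higher than the single lump, which is the sense in which a community detection method can "pick out" a cohesive group without being told about it.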

In the Programming Historian lesson, we discuss what we can learn about Quakers by examining their relationships as its own separate network. Now that we know Quakers are a distinct subgroup in the Six Degrees network, what can we learn about the relationship between this group and the larger early modern social world?

A common way of considering the relation between network subgroups is finding brokers, nodes that form connections between otherwise disparate groups. If we mouseover William Penn's node on this graph, we can see that he might have acted as a broker:

You can see how Penn, the large-ish blue node in the center, connects to a number of other blue-node Quakers but also to a number of orange nodes: including King James II and other members of his circle. From the visualization, it appears that Penn sutured the Quakers to the social world of the late Restoration court. But this visual observation alone is not enough to establish that Penn was a broker.

Brokerage in networks can be determined quantitatively with a measure called betweenness centrality. Nodes with higher betweenness centrality have more shortest paths running through them and are therefore more likely to be brokers.[1] We calculated betweenness centrality for every node in the network, and then pulled out only the nodes in the Quaker group and ranked them. Here is a sorted list of Quakers by their betweenness centrality:

William Penn 0.006801584690719992
James Nayler 0.004232792432448821
John Perrot 0.003832392021618369
George Fox 0.003243500088605162
George Whitehead 0.0018450596874039888
Benjamin Furly 0.0017626086726237084
Thomas Violet 0.0017189520163916239
Thomas Ellwood 0.0015495121830386375
William Goodsonn 0.0015376783949921783
John Swinton 0.0014979686448861693
Edward Burrough 0.0012875578480300996
Isaac Penington 0.0011378935216530058
Margaret Fell 0.0011178598622033282
Thomas Curtis 0.001064103520011229
John Story 0.0009842838099021632
Robert Barclay 0.0008471849302255813
John Audland 0.0007205817210995546
Gilbert Latey 0.000689612291736668
Francis Howgill 0.0006169090369192608
Anthony Pearson 0.00048656718614419585

Observation 2: William Penn has higher betweenness centrality than any other Quaker, indicating greater brokerage with other early modern social groups.[2]

Penn's higher betweenness centrality confirms what we saw in the visualization: that Penn, through his social connections, acts as an intermediary between Quakers and the larger political forces of the late 17th century. This dovetails nicely with what we know about Penn's historical role as a tolerationist, forming a bridge between earlier dissenting Quakers, like Margaret Fell and George Whitehead, and the Catholic, Anglican, and Presbyterian factions swirling around the courts of Charles II and James II.
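The procedure described above (score every node in the whole network, then keep and rank only the members of one group) can be sketched in plain Python. This is an illustrative implementation of Brandes' algorithm for unweighted, undirected graphs, not the code we ran for this post; the five-node graph and the names `q1`, `q2`, `broker`, `c1`, and `c2` are made up. Scores are normalized by the number of node pairs, so values fall between 0 and 1.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm. adjacency maps each node to a set of neighbours."""
    scores = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {v: [] for v in adjacency}
        path_count = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v is on a shortest path to w
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    n = len(adjacency)
    # Each unordered pair was counted from both ends, hence this scale factor.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: s * scale for v, s in scores.items()}


# Hypothetical toy network: a small group tied to a "court" through one member.
adjacency = {
    "q1": {"q2", "broker"},
    "q2": {"q1", "broker"},
    "broker": {"q1", "q2", "c1"},
    "c1": {"broker", "c2"},
    "c2": {"c1"},
}
group = {"q1", "q2", "broker"}

scores = betweenness_centrality(adjacency)  # computed on the whole network
ranked = sorted(group, key=lambda node: scores[node], reverse=True)
# Ties outside the group, the kind of extra count that can back up a brokerage claim.
outside_ties = {node: len(adjacency[node] - group) for node in group}
```

In this toy case `broker` ranks first within the group and is the only member with a tie outside it, which mirrors the shape of the argument made about Penn, on a much smaller scale.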

Penn, of course, goes on to become a political force of his own, leveraging his connections with the king to found the colony of Pennsylvania. The Quaker State may owe its origins not only to Penn's tolerationist ideology but to his role as a broker, as highlighted by the social connections made visible through Six Degrees.

[1] A quick caveat about path-length-based measures like betweenness centrality in the Six Degrees network. Because we know that the basis of our data (the ODNB) and our statistical inference method produce a network that is missing many of the possible historical social connections from the period, it's important not to lean too heavily on measures based on path lengths. Because we're missing so many possible paths, those measures could change drastically as we collect more data. However, in this case, because the Quaker group is so distinct from the rest of the network, we believe betweenness can tell us something important about the standout nodes in this group.

[2] You might be asking, "Does William Penn also have a high degree (i.e., doesn't he just know a whole lot of people)?" Yes, he does, and the two are certainly correlated. But Penn wouldn't be a broker if he simply had high degree and high betweenness centrality within the Quaker group---he also knows more people outside of the Society of Friends, making both his degree and betweenness figures more significant. That is to say, we could bolster our observation by counting Penn's connections in a few different ways.

#network #friends #quakers #betweenness #python

6dfb · 8 years ago

Introducing: Six Degrees on Wikidata

This is a guest post by Andrew Gray

An introduction to Wikidata

Wikidata is a linked-data project run by the Wikimedia Foundation, better known for Wikipedia---which no doubt is familiar to you all.

It began as a simple spine linking together the same set of concepts in different languages, allowing us to say that the Wikipedia articles "ham sandwich" (English), "ساندویچ ژامبون" (Farsi), and "bánh mì thịt nguội" (Vietnamese) are all on the same topic, wrapped up as the enigmatically named entity Q2220900. It then developed into a more complex linked-data system, with defined relationships between these entities - for example, Q2220900 is a sub-class of entity Q28803, sandwiches, and has parts Q170486 and Q7802 - appropriately, ham and bread. These are themselves linked in to other entities, building a detailed scaffold of information. The system is fully multilingual, so we can render the relationship Q2220900 : P527 : Q7802 as "ham sandwich : has part : bread" in English, or "sándwich de jamón : compuesto de : pan" in Spanish.
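As a rough illustration of how one language-neutral statement can be rendered in several languages, here is a minimal Python sketch. The identifiers and labels are the ones quoted above; the hard-coded `LABELS` table and the `render_triple` helper are hypothetical stand-ins for a lookup against Wikidata itself.

```python
# Labels copied from the example above. A real application would fetch
# them from Wikidata rather than hard-coding them.
LABELS = {
    "Q2220900": {"en": "ham sandwich", "es": "sándwich de jamón"},
    "P527": {"en": "has part", "es": "compuesto de"},
    "Q7802": {"en": "bread", "es": "pan"},
}


def render_triple(triple, language):
    """Render an (item, property, value) statement using labels in one language."""
    return " : ".join(LABELS[identifier][language] for identifier in triple)


statement = ("Q2220900", "P527", "Q7802")
english = render_triple(statement, "en")  # "ham sandwich : has part : bread"
spanish = render_triple(statement, "es")  # "sándwich de jamón : compuesto de : pan"
```

The statement itself never changes; only the labels attached to each identifier do.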

Entities can also have descriptive metadata---the item knows that a ham sandwich looks like this - and can link out to external resources such as this collection of nutritional data. Finally, information can be marked as sourced or contested---most useful for things like population data, but in this case, imagine, say, that there is a dispute between two authorities about whether or not the platonic ham sandwich should include butter…

The data is all available under a Creative Commons public domain dedication, so it's all freely available for whatever wish you might want to put it to.

Thankfully, it's a bit more useful than just a database of sandwiches. Wikidata contains an entity for every distinct topic covered on Wikipedia - a very broad scope - and has the capacity to be extended to cover any other distinct topics so long as they are identifiable, significant, and reasonably well-described in external sources.

And those entities include, as of 19 May, some 3,464,213 real people (as well as another ~45,000 fictional or mythical ones, plus about 7000 horses, 500 dogs...). So, onward to biographies!

Biographical identifiers

Around five years ago, there was a good deal of work importing library authority identifiers into Wikipedia and then into Wikidata. As of late 2013, we had matched around 420,000 people against some form of external identifier - mostly VIAF, the Virtual Integrated Authority File, but around half linked to the German GND database, half to ISNI, the International Standard Name Identifier, and a third to the Library of Congress names record. This was just the start. A second wave of work focused on matching against identifiers for more useful full-text resources, most prominently the Oxford Dictionary of National Biography (thanks to the kind help of Jo Payne at OUP). There are now around a hundred external identifiers which match to over 1,000 people. These range from fairly straightforward library authority records to full-text biographical dictionaries, databases of performers, historical professional indexes, and genealogical databases. Having such a wide array of identifiers in one place means that Wikidata is slowly becoming an effective "biographical spine", tying together a wide range of different digital projects and allowing them to be cross-referenced without having to do the matching work individually for each pair.

Once we had completed the matching to the Oxford DNB, it was a relatively simple task to match Wikidata against Six Degrees of Francis Bacon---almost every item in Six Degrees is already linked to the ODNB. From the Wikidata side, projects like Six Degrees are really interesting. We currently model some basic family relationships (e.g. parent, child, sibling, spouse, godparent) and 'professional' relationships (successor in a role, pupil of, patron). However, Six Degrees goes massively beyond this, exposing data like "guardian", "creditor", "heir", "parishioner", "correspondent", through to nuanced material like "knew in passing", "acquaintance", "rival", "friend". It makes sense to link out to someone else who does a good job of this, rather than try to reinvent the wheel ourselves.

And now, as of this month, we have a reliable match between 13,416 people in Wikidata and Six Degrees. (Strictly speaking, there are 13,428 entries - some of them are the same person listed twice in Six Degrees under slightly different names.) This is everyone who was in Six Degrees when we initially harvested it in late 2015; I'll be looking into updating this with more recently added people in the near future to take advantage of the steady flow of new names.

Playing with the data

We can use the Wikidata matches to tell us a little more about the overlap between Six Degrees and other databases. Those 13,416 correspond to…

7700 entries in the Virtual Integrated Authority File, aggregating library authority databases

3400 entries in the University of Virginia Social Networks and Archival Context project

2150 entries in the (UK) National Portrait Gallery database - mostly sitters, some artists, some both

1900 entries in Early Modern Letters Online

1700 entries in the University of Cambridge Alumni Database

1650 entries in the (UK) History of Parliament

550 entries in the Getty Union List of Artist Names

Plus some more idiosyncratic connections -

250 people who appear in the OpenPlaques database of commemorative plaques

230 authors of works in Project Gutenberg

180 artists in the Art UK database of paintings in public collections

40 entries in the Internet Movie Database (some as characters, some as credited composers or playwrights)

These offer a range of possibilities to someone working with Six Degrees data. Some of them offer rich background information above and beyond the ODNB (e.g. the excellent History of Parliament). Others have network information that could supplement that in Six Degrees (Early Modern Letters Online). Others offer ways to find the works of a subject (Gutenberg, VIAF, ArtUK).

So far, so good. But the real fun with Wikidata comes from the fact that it has aggregated all sorts of very strange information, and so we can tie the Six Degrees data into the wider morass of twenty-six million linked items.

For example, we know of 187 people in Six Degrees who, collectively, have 504 things named after them. (Since you wondered, 32 of these are asteroids.)

We have (limited) data on places - here is a map of 2000+ people whose place of death was indexed:

And a bubble chart of the 25 most common "occupations" - as you can see, we've primarily indexed politicians and clergy

And finally, one that will be of interest to expanding Six Degrees - 238 cases of someone with a Six Degrees identifier married to someone without one.

Other potential uses

Wikidata also offers some quite quick-and-easy enhancements. For example, around 2800 of the Six Degrees people have copyright-free images of them linked to the Wikidata items. A service using Six Degrees identifiers can easily pull up images of its subjects - here is a sample of a hundred.

All these database queries are generated from a public SPARQL endpoint, which is a very powerful query tool. A list of example queries suggests the scope of things that it can produce.
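To give a flavour of what such a query looks like, here is a hedged Python sketch that builds (but does not send) a request for people who have both a Six Degrees identifier and an image. `P18` is Wikidata's image property and the URL is the public query service. The Six Degrees property ID is deliberately left as a parameter: the `"PXXXX"` value below is a placeholder, not a real property, and must be replaced with the actual ID before the query will return anything. The helper names are invented for this example.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_image_query(sdfb_property, limit=100):
    """SPARQL for items that carry the given identifier property and an image (P18)."""
    return (
        "SELECT ?person ?sdfbId ?image WHERE {\n"
        f"  ?person wdt:{sdfb_property} ?sdfbId .\n"
        "  ?person wdt:P18 ?image .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query):
    """URL that asks the endpoint to run the query and return JSON."""
    return WIKIDATA_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# "PXXXX" is a placeholder for the Six Degrees identifier property (assumption).
example_query = build_image_query("PXXXX", limit=100)
example_url = build_request_url(example_query)
```

Swapping `wdt:P18` for another property (place of death, occupation, spouse) gives the skeleton for the other examples in this post.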

If all you need is a list of identifiers, a simple report is available through the BEACON tool which generates crossreferenced lists between a given external identifier and Wikidata, or if needed, two external identifiers via Wikidata.

For reconciling Wikidata entries against an external database, we have a human-mediated crowdsourced matching tool. This is public and any external dataset can be loaded into it, with a bit of care and formatting - it does not require that a property exist on Wikidata, although ideally we'd love the matches to go back into the public dataset eventually.

If you have a dataset that you'd like us to try and integrate against Wikidata, or you can see an interesting use for some of this information - please do get in touch! We're always on the lookout for interesting data to grow the scope of the project.

Finally, I'd like to acknowledge the hard work of all the Wikidata volunteers who worked on this - most importantly Charles Matthews and Magnus Manske, who have done enormous amounts of work on the biographical identifiers project over the past few years - and also the Six Degrees team for making such an excellent project openly available.

#wikidata #lod #opensource

6dfb · 8 years ago

"Ut Tensio Sic Vis": Introducing the Hooke Graph

Or, The Return of Spring

By John Ladd

A 2004 portrait of Robert Hooke by Rita Greer, via Wikimedia Commons. Note the spring by his left elbow.

In A Description of Helioscopes and some other Instruments (1676), the natural philosopher Robert Hooke promised to publish a theory of elasticity sometime in the future and teased his readers with a frustrating (if pleasingly alphabetical) anagram: ceiiinosssttuu.

Thankfully, nothing in the current Six Degrees interface is quite so obscurantist, but there are parts of the site that have somewhat puzzling names. Right now in our visualization view, you can search for a "network," and the result is a very specific kind of graph: a force-directed network visualization. These force-directed graphs are the dominant mode of visualized networks, the collection of nodes and links that you're used to seeing everywhere. Here's our friend Robert Hooke's graph as an example:

(Go to the live version of this graph on Six Degrees)

This familiar arrangement is not the only way to display a network. Networks, after all, are made of data---edge tables or adjacency matrices that can be visualized in any number of ways (see below). And we have more visualizations in store for our redesign: circular layouts, timelines, and other schemes are in the works.

Examples from the web of other visualization layouts, from left to right: arc diagram, chord diagram, adjacency matrix, and hierarchical graph drawing.[1]

But we quickly ran into a problem. Since we plan to present more than one type of visualization to the user, it's no longer suitable to refer to these original graphs simply as "networks." Internally, we started calling them force-directed graphs to differentiate, but this name didn't seem right for our site---it's a bit vague and potentially confusing to the newly-initiated. In a force-directed graph, what kind of force is doing the direction?

This brings us back to Hooke and his mysterious anagram. Two years after A Description of Helioscopes, Hooke published the answer to his puzzle in Lectures de potentia restitutiva, or of spring. The anagram stood for "Ut tensio sic vis," a principle that describes the behavior of springs. Basically it states that "a spring's extension or displacement from its neutral position is directly proportional to the force applied."[2] This principle became known as Hooke's Law, and can be used in conjunction with other forces to produce network graphs in which there are "as few edge crossings as possible."[3] That is to say, Hooke's Law helps create visualization algorithms that make network graphs readable. This is why force-directed visualization algorithms are often referred to as "spring layouts."
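For the curious, here is a minimal sketch of the idea in plain Python. It is not the layout code used on the site. Every edge is treated as a spring obeying Hooke's Law (a pull proportional to how far the edge is stretched past its rest length), every pair of nodes pushes apart with an inverse-square repulsion, and each node is nudged a little along its net force. The function names, the parameter values, and the two-node graph are all assumptions chosen for illustration.

```python
import math


def _offset(positions, a, b):
    """Return (dx, dy, distance) from node a to node b."""
    dx = positions[b][0] - positions[a][0]
    dy = positions[b][1] - positions[a][1]
    return dx, dy, math.hypot(dx, dy) or 1e-9  # avoid dividing by zero


def spring_layout_step(positions, edges, rest_length=1.0, stiffness=1.0,
                       repulsion=0.1, step=0.1):
    """Move every node once along the net force acting on it."""
    forces = {node: [0.0, 0.0] for node in positions}
    nodes = list(positions)
    # Every pair of nodes repels, which keeps the drawing from collapsing.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx, dy, dist = _offset(positions, a, b)
            push = repulsion / dist ** 2
            forces[a][0] -= push * dx / dist
            forces[a][1] -= push * dy / dist
            forces[b][0] += push * dx / dist
            forces[b][1] += push * dy / dist
    # Hooke's Law along each edge: force proportional to the extension.
    for a, b in edges:
        dx, dy, dist = _offset(positions, a, b)
        pull = stiffness * (dist - rest_length)
        forces[a][0] += pull * dx / dist
        forces[a][1] += pull * dy / dist
        forces[b][0] -= pull * dx / dist
        forces[b][1] -= pull * dy / dist
    return {
        node: (positions[node][0] + step * fx, positions[node][1] + step * fy)
        for node, (fx, fy) in forces.items()
    }


def settle(positions, edges, iterations=200, **options):
    """Repeat the step until the layout has (hopefully) stopped moving."""
    for _ in range(iterations):
        positions = spring_layout_step(positions, edges, **options)
    return positions


# Two connected nodes that start three units apart.
start = {"a": (0.0, 0.0), "b": (3.0, 0.0)}
springs_only = settle(start, [("a", "b")], repulsion=0.0)
with_repulsion = settle(start, [("a", "b")], repulsion=0.1)
distance_springs_only = _offset(springs_only, "a", "b")[2]
distance_with_repulsion = _offset(with_repulsion, "a", "b")[2]
```

With springs alone the two nodes settle at the rest length; adding repulsion leaves them slightly farther apart. In a larger graph, that balance between pull and push is what spreads the nodes into a readable arrangement.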

We found the combination of a seventeenth-century scientific principle and modern network visualization impossible to resist, so we've rechristened our force-directed graphs "Hooke graphs," in honor of the man himself. Look for the newly-redesigned Hooke graphs when the new site is released, and happy (belated) first day of Spring.

[1] Arc diagram: screenshot from this D3 visualization by Matthew Clemens; Chord diagram: cropped screenshot of Wikiviz datatelling project, original image prepared by Jen Lowe; Adjacency matrix: from Wikimedia Commons; Hierarchical graph drawing: from Wikimedia Commons, prepared by user Tehnick with Graphviz software.

[2] Patri J. Pugliese, ‘Hooke, Robert (1635–1703)’, Oxford Dictionary of National Biography, Oxford UP, 2004; online edn, May 2006 http://www.oxforddnb.com/view/article/13693, accessed 7 April 2017. Above information about Hooke also comes from this article.

[3] If you're interested in the math, I recommend this very detailed Wikipedia article which includes many good references.

#network #science #history of science #datavisualization #dataviz

6dfb · 8 years ago

How the Bacon Gets Made

John Ladd

In my last post I gave an overview of the design sprint we held back in February; in this one I’d like to give a sneak peek of what we’ve been working on—to show you, as Chris Warren said on the final day of the sprint, how the bacon gets made. The best way to do this is to show off a few of the many whiteboards we used for brainstorming throughout the week. Here’s a selection of the messy but (we hope) endearing sketches that led us to our current batch of prototypes and the plan for our redesign.

I. User Contributions

I drew these boxes myself, so I’m at fault for the fact that they lean precariously. These three sketches represent three possible ways for a user to contribute to the site. Option (2) is the current form we provide users, option (3) is a proposal for contributions from a data table page, and option (1) is a proposal for contribution through any visualization. Below the three proposals is a list of all the actions available to users of different types—standard users, curators, and admins. This gave us a clearer picture of the options we would need through any given contribution interface. This board helped us get our heads around how we wanted users to approach contribution on the site, and made us realize that standard users of the site probably don’t need the original forms if contributions through the visualization and the table view are robust enough.

II. Contribution Mode

In this drawing, which was done by Paolo Ciuccarelli based on discussion in the Contribution Workflows group, we began to think seriously about what it would mean for users to contribute through a visualization. We would need a new visual grammar for links and nodes that had been added by the user---should we use dotted lines or a certain color to identify these?

Thinking about how to visualize edges in particular had us thinking about relationships that have been edited or added by a person versus ones that were calculated by our network inference algorithm. You can see at the bottom left that we started to think through how to differentiate these. This kind of convergence happened a lot throughout the week: while working hard on one problem we’d have a breakthrough on a related issue.

III. Visualizing Shared Groups

Our “shared group” visualization has long been waiting for its moment in the sun. Right now if you search for a shared group on the site, you get a data table rather than a visual representation. So we wanted to figure out how best to represent shared group membership. This board shows multiple attempts to come up with a visualization scheme for essentially three types of people nodes---members of only Group A, members of only Group B, and members of both groups. You can see on the right that we ran into a big problem: as users of the site enter more and more group members, would we wind up scrolling endlessly to see all the nodes? And/or would the visualization get so noisy as to not be useful?

As so often happens, by the time we came up with a solution we were almost out of space on the whiteboard. The trick turned out to be to think of nodes as something other than people. Instead of displaying a visualization for any two groups and their shared membership, we decided to display a single visualization for all groups, in which a node would be a group and an edge would be a weighted line representing how many members those groups share. This could give the user a quick sense of how various groups overlap while still having access to lists of shared group members (by clicking on an edge). Sometimes the whiteboards were as much about showing us what wasn’t working as what was.
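The data behind that "groups as nodes" view can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the site: the `shared_group_edges` helper, the group names, and the member IDs are all hypothetical.

```python
from itertools import combinations


def shared_group_edges(memberships):
    """Map each pair of groups with common members to the set of people they share."""
    edges = {}
    for group_a, group_b in combinations(sorted(memberships), 2):
        shared = memberships[group_a] & memberships[group_b]
        if shared:  # groups with no members in common get no edge
            edges[(group_a, group_b)] = shared
    return edges


# Hypothetical membership data: group name -> set of person IDs.
memberships = {
    "Group A": {"p1", "p2", "p3"},
    "Group B": {"p2", "p3", "p4"},
    "Group C": {"p5"},
}

edges = shared_group_edges(memberships)
# Edge weight = number of shared members; this would drive the line thickness.
weights = {pair: len(members) for pair, members in edges.items()}
```

Keeping the set of shared people on each edge, rather than just the count, is what would let a click on an edge bring up the list of shared members.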

IV. Meta-structures

By the end of the week, we had to begin thinking about how the different modules, pages, and modes we had designed would work together. The tree on this whiteboard shows the different visualization views, how they are organized, and how they might link to one another. On the right is a drawing of how these different views will be available through our search box UI. Getting all of this on a single board was an exciting moment at the end of the week---after working hard on individual problems we were finally able to see the big picture of our new site structure.

The irony of using a simple analog technology to design a complex digital interface hasn’t escaped us. For group discussion and quick iteration through ideas, there was nothing better than drawing up one’s thoughts on a whiteboard for others to build on. We’re still referring to these boards frequently as we build the next version of Six Degrees.

#design

6dfb · 8 years ago

Why DH Projects Should Try Design Sprints

John Ladd

In the past two years since the launch of our beta site, we’ve been making plans for a 1.0 release—a complete redesign of the Six Degrees web application taking into account everything we learned from the beta. We’ve developed a long list of desired new features and enhancements, some of which we’ve already incorporated into the beta site and others that we knew would have to wait until we could rethink the whole design. Thanks to our NEH implementation grant, last week we got a chance to assemble some collaborators from the wider DH community and kick off the redesign process.

But where do you begin with such an extensive task, when there’s so much to be done? What are the methods for collaborative decision-making at this scale? We knew there were many large and small decisions to make, and we knew we would need help from design and programming professionals, especially from the generous and skilled team at Density Design. It was through Density Design that we discovered the solution: a week-long Design Sprint in which we would, as much as possible, focus solely on the project for five days of discussion, design, and prototyping. In the next post, I’ll get into more detail about some of the design decisions we made and how we plan to improve the site, but first we wanted to explain the rationale for working in a sprint and why we think it’s useful for large DH projects like ours.

Six Degrees is no stranger to collaborative work environments. Our site itself is set up to capture the wisdom of the scholarly “hive-mind,” as Anthony Grafton has said, and we’ve had great success with big collaborative events, like our multi-city Networking Women add-a-thon. In some ways the Design Sprint is a sister event to the add-a-thons. Instead of bringing together people with diverse expertise to add and refine data, we brought together people with diverse skill sets for design problem-solving. Because of their brevity, sprints are perfect for quick iteration—the ability to come up with several different options, test them, and make decisions by the end of the week. To get so much done in such a short amount of time, our sprint required some careful planning.

We began with the core idea of a sprint, which comes mainly from the discipline of design. We wanted this to be an opportunity to engage with designers from the start of our process, rather than waiting to “add design” to our project at the end. We learned a lot from Paolo Ciuccarelli’s public lecture, which he gave at the start of the week as part of Density Design’s visit to CMU. Thinking with designers helped us to allow our interface to grow naturally from our central ideas and from our users’ needs. We think the growing opportunities for collaboration between design and the digital humanities are an exciting and important development in both fields. This development was the subject of Paolo’s talk, has driven Density Design’s many DH collaborations, and is the impetus behind events like the upcoming DH + Design Symposium at Georgia Tech. A sprint allows designers and DH scholars to engage on one another’s terms and to build a shared critical language. Each day of a sprint is defined by a design stage: Unpack, Sketch, Decide, Prototype, and Test. We decided as a group to take this approach and adapt it to our needs as digital humanists, as designers, as developers. What could we learn from these goal-oriented constraints? How might we accommodate the many modes of thinking needed in a large DH project (and importantly, which elements of this approach were not suited for humanities inquiry)? What were the needs of our interdisciplinary team?

The archetypal sprint assumes that the week’s starting point is the seed of an idea, but our project has long had a fully-operational site, the result of a lot of thought and decision-making (to which this blog is a testament). Our process had to build on what came before while allowing fresh ideas, but without starting over from the beginning. Therefore the first days of our sprint took a slightly different shape. The core Six Degrees team—Chris Warren, Dan Shore, Jessica Otis, Scott Weingart, and myself—first had to present the project’s main ideas and familiarize our collaborators with the existing site. So on Monday morning we began with a round of presentations on the project’s origins, its data organization style, its approach to network visualization, its expected users, and more. These presentations were punctuated by brainstorming around specific features, usually spurred by insightful questions from our Density Design collaborators—Paolo Ciuccarelli, Michele Mauri, Tommaso Elli—and from our colleagues in the wider Carnegie Mellon and Pittsburgh DH communities—Susan Tanner, Dan Evans, Adam Perer, David Newbury, and Gena Hong. This process turned out to be crucial for us, as later in the week it helped us to focus on those issues with the current design that needed the most attention.

Once we established shared knowledge, we split ourselves into three working groups organized around three areas of the site we felt needed attention: the contributor workflows, the site’s data tables interface, and our network visualizations. This allowed us to work on the DH problems of crowdsourcing, data curation, and data visualization, but it also allowed us to work on the design problems of user experience, user interface, and (again, but from a different perspective) data visualization. The working groups helped us to continue our interdisciplinary work: the Density Design team and the Six Degrees team were split evenly among the groups to encourage the sharing of expertise in different domains.

We treated these groups as a loose form of organization only. There was plenty of talk among the groups, and by the end of Day 2, when new ideas were starting to form, it was clear that some groups had a larger task than others. On Wednesday, the day typically intended for the convergence of multiple ideas, we came together as a single group and rethought our organization system. This was a pivotal moment for us: as each working group explained its ideas to the full group, plans took shape, and we wound up reorganizing into just two groups, one dedicated to user interface and dataviz, and the other in charge of user contributions both inside and outside of the visualizations. Having this flexibility built into our process allowed us to continually reassess our progress as it aligned with our project goals, and we certainly made it farther by being willing to reconsider our work plan.

We spent part of the last two days building and prototyping, as the normal sprint structure suggests, but we also spent time planning our longer collaboration. We’re thrilled to be entering into a months-long collaboration with Density Design, and despite how intense and productive the sprint week was, no DH project of this size could be fully redesigned in five days. Instead we discussed and sorted our ideas into a work plan for the next few months, coordinating with our new developer, David Newbury, who will be doing work on the site’s APIs and back-end while Density Design and the Six Degrees team work on the front-end. Had we focused too much on prototyping in the final days, we would have missed this crucial opportunity to extend the design sprint’s spirit of collaboration into future work on the project.

Since we devised many changes and prototypes in the final days that have not yet been fully integrated into a new site, we ended the sprint by coming full circle. We invited grad students, faculty, and staff from around CMU to see what we’d come up with in the past week, to explore our potential new interfaces, and to ask questions. While this isn’t exactly the formal user testing with which design sprints usually end, we found it valuable to see which of our ideas were working and which needed more time and consideration. We’re confident now that we have a clear roadmap for the months ahead, as we prepare to relaunch Six Degrees. Watch this space for more information on the roadmaps we drew during the sprint and how they are guiding our design process.

#design #digitalhumanities


6dfb · 8 years ago

Text

New Six Degrees of Francis Bacon Team Member!

We’re delighted to announce that we have a new member of our team!

John Ladd, who’ll be with us full time for calendar year 2017, comes to us from Washington University in St. Louis, where he is a PhD candidate specializing in 17th-century poetry, book history, and the digital humanities. His dissertation work, on social networks and literary collaboration in the early modern period, includes a network analysis project of printed dedications in EEBO-TCP. The questions, tools, and methods for this project couldn’t be a better fit with the goals of Six Degrees. He also comes to us with digital project management experience, having worked on the Spenser Archive, EarlyPrint, and several other projects at WashU’s Humanities Digital Workshop. Coincidentally, John is also a native Pittsburgher, which makes his arrival at CMU something of a homecoming.

John’s main tasks over the next year will be (a.) enriching project data; (b.) enhancing user experience; (c.) integrating with other digital resources; (d.) identifying and partnering with an institutional home for long-term preservation; and (e.) packaging and distributing website code so that scholars can create similar networks for different eras and regions.

In short, he’ll be leading day-to-day programming and data curation activities for the project. The funding for his position comes from a National Endowment for the Humanities Digital Implementation grant awarded in 2016.

John’s excited to get to know the wider Six Degrees community, so feel free to reach out to him by email ([email protected]), on his personal Twitter @johnrladd, or on the Six Degrees Twitter @6Bacon, which he’ll be taking over.

Welcome, John!


6dfb · 9 years ago

Text

Seeking A Postdoctoral Fellow

Thanks to a recent Digital Humanities Implementation Grant from the National Endowment for the Humanities, Carnegie Mellon University’s Department of English seeks a one-year Postdoctoral Fellow/Research Associate to lead day-to-day programming and data curation activities for Six Degrees of Francis Bacon. Six Degrees of Francis Bacon is a digital reconstruction of the early modern social network that scholars and students can collaboratively expand, revise, curate, and critique. The successful candidate will likely have a PhD in History, English, Library and Information Science, or a related discipline with demonstrated experience in web development or digital humanities.

The fellow will be housed in the Department of English in the Dietrich College of Arts and Social Sciences and work with Associate Professor Christopher Warren, Principal Investigator of the Six Degrees of Francis Bacon project. Day-to-day work will involve a disciplinarily diverse and geographically disparate team of Six Degrees collaborators, including literary historians, historians of science, librarians, statisticians, and web developers.

Job Duties

The fellow will leverage expertise in a humanities discipline and a strong technical aptitude to help fulfill five priorities of the NEH Digital Humanities Implementation Grant:

Enriching project data.

Enhancing user experience.

Integrating with other digital resources.

Identifying and partnering with an institutional home for long-term preservation.

Packaging and distributing website code so that scholars can create similar networks for different eras and regions.

Required Knowledge and Skills

Ph.D. or ABD in a relevant subfield of a humanities discipline or Library and Information Sciences.

Demonstrated ability to work collaboratively and successfully in a team-based environment.

Demonstrated willingness to learn technical programming and data curation skills.

Excellent verbal and written communication skills.

Preferred Knowledge and Skills

Experience with modern web development, system administration, databases, or programming languages relevant to the project, including R, Ruby on Rails, JavaScript, or Python.

Demonstrated experience in project management and/or digital humanities research.

Ph.D. in a relevant subfield of the humanities.

How to Apply

Please submit a cover letter, a CV with links to current/past digital projects, and contact information for three references at https://cmu.taleo.net/careersection/2/jobdetail.ftl?job=2003958.

Review of applications will begin September 16, 2016, with Google Hangout interviews likely beginning in October.

More Information:

Please visit “Why Carnegie Mellon” to learn more about becoming part of an institution inspiring innovations that change the world.

A listing of employee benefits is available at: http://www.cmu.edu/jobs/benefits-at-a-glance/index.html

Salary: $60,000


6dfb · 9 years ago

Link

Want to look behind the scenes and see how the wonderful @6Bacon_Bot was made? Creator Dan Evans gives the scoop.


6dfb · 9 years ago

Text

Toward a Pragmatics of Error in the Digital Humanities

Chris Warren

What’s the role of error in the digital humanities? Let me offer as a kind of parable a story I encountered about two people who had difficulty communicating (I’ve searched in vain for the source. If you know it, please let me know).

One of the people habitually mumbled. The other was older, hard of hearing. Challenging at the best of times, mutual comprehension was especially difficult for these two. The mumbler started sentences confidently but gradually lost force as he continued. His interlocutor caught snippets and sounds, but she had difficulty reassembling them into sensible utterances.

“My sister went to Cuba in the 80s,” the mumbler might say.

To the hearer, these sounds had a duration and a cadence, but sounded like, “My sister wuh-duh oo-wuh wuh-duh ay-ees.”

What were they to do?

Here’s where this starts to become a parable about error in the digital humanities. At first, the hard-of-hearing listener responded the way many people do when they can’t hear.

“I’m sorry, I couldn’t understand that, do you mind repeating it?” Problem was, this typically just initiated the same cycle once more. Repetition, incomprehension, bafflement, and on and on. After many such turns around the merry-go-round, she decided to take another tack. She’d repeat what she had understood in order to focus the mumbler’s attention on what she hadn’t.

“Your sister…?”

The mumbler would of course repeat, “…went to Cuba in the 80s.”

And yet this helped very little. The listener didn’t need repetition to resolve those fuzzy sounds. She needed enunciation. That’s when she realized the force of what I’d like to call a pragmatics of error, the strategic use of error for dialogic understanding.

What she learned to do was to intentionally misstate what she’d heard. The best way she could elicit something that made sense, it turned out, was to say something grammatically correct and aurally similar to what she’d heard but most likely—and this is very much the point—wrong.

“Your sister played tuba with her babies?”

“NO, my sister went to CU-ba in the EIGH-Ties.” Message understood.

There's a lesson here, I think, as people come to terms with some potentially disorienting aspects of digital humanities. This week, for instance, our Six Degrees of Francis Bacon team released something I could hardly have imagined I'd be associated with 10 or even 3 years ago. It's a Twitter bot that’s automatically tweeting data from Six Degrees of Francis Bacon one relationship at a time. As we’ve noted often, most of the data in Six Degrees is inferential, grounded in a probabilistic account of who knew whom in early modern Britain. Broadly, the more two names are mentioned together in the history of scholarship, the likelier we think those people knew one another in real life. But we’re altogether aware that computerized inference isn’t perfect, which is why our whole site is constructed for knowledgeable humanists to improve upon what computers get wrong.

And yet the Twitter bot makes very public what, in truth, we’ve sometimes obscured. There are many, many errors lurking in the Six Degrees of Francis Bacon data, or to put it more charitably (and probably more accurately), in the lower reaches of our likelihood estimates there’s a great sea of purely possible relationships in which lurk many truly factual ones.

An example: one of the bot’s earliest tweets was about the Independent minister John Owen and John Fell, Bishop of Oxford.

It is 47% likely that John Fell Bishop of Oxford met John Owen: https://t.co/l1OSOZyiE9

— 6° of Bacon Bot (@6Bacon_Bot)

May 12, 2016

According to our current data, it is 47% likely that they met one another. Now, any seventeenth-century historian will tell you that it's far more likely than that (Owen was in effect Fell’s boss during the Interregnum). And there’s something more than a little awkward about tweeting that deceptively specific number—47%!—into a world of many people who know better, and perhaps worse, many people who don’t.

But I return to the pragmatics of error. For now, the bot will be tweeting warts-and-all relationships at random, hourly, and there will no doubt be some howlers, because that’s what our data is. Yet there’s a dialogic community for whom Six Degrees errors can and should be provocations, invitations to resolve more clearly what for many is still a sense of the past too fuzzy, too inchoate, too ethereal to wrap heads and hearts around. If earlier generations might have taken a purely censorious approach to error, today's humanists can and should take a more pragmatic approach. There often remains an infuriating gap between what true experts know and the ways that computers take up and represent that knowledge, but this gap is where slightly alien forms of humanistic scholarship like Twitter bots exist, and it is one that bots like ours seek to bridge.

In a thoughtful review of Franco Moretti's Distant Reading, Shawna Ross has invited digital humanists to

privilege openness and adopt the ad hoc playfulness of open-source programming, recognizing that systems and theories and methods are but temporary scaffolding, not the grand erection of something permanent and totalizing.

She notes rightly that some "digital projects are often ends in themselves, not means," and that Moretti and other digital humanists might benefit from embracing failure more fully.

There's an ad hoc playfulness in our Twitter bot to be sure, but I confess I do still think of Six Degrees as a means--a means specifically to a fuller, more accessible, more interconnected picture of the early modern past.

If you see something tweeted by our bot that looks off, then, kindly remember the story of the mumbler and his interlocutor, who learned--counterintuitively--that saying the wrong thing was the best way to get things right. Bots, visualization, crowdsourcing: all of these, in fact, may well have a new role to play in helping humanists make our mumbling understood.


6dfb · 9 years ago

Text

Midwives and Poets: What’s in a Relation?

As scholars and students joined us from around the country and across the Atlantic for the “Networking Early Modern Women” event - contributing 125 new people, 265 relationships, and 502 relationship type assignments - some requested that the Six Degrees Team add a variety of relationship types to our existing list. Our existing taxonomy of relationship types accounts for parent of or spouse of or heir(ess) of. But what about midwife of, lady-in-waiting to, and even wrote a poem about? After some consideration, we added the first, modified and added a version of the second, and rejected the third. This blog post will focus on the criteria by which we decide which relationship types to add to the network, criteria that are far more complex and contestable than those for adding a person.

Our team has had extensive debates about what relationship types should and shouldn’t be included in our social network. While these debates are to some extent still ongoing, we decided it would be useful to share our current criteria for inclusion.

Granularity

At what point does the number of relationship types in our ontology grow so large that the types lose all meaning as analytical categories? This is a question that the Six Degrees team has worried about before and will doubtless worry about again. Is it sufficient to have a category for sibling of, or do we need separate categories for full-blooded siblings, half-siblings (subdivided into legitimate and illegitimate?), and step-siblings (plus step-step-siblings?) as well? Does cousin of collapse too many different categories of cousins, implying first cousins are the same as third cousins twice removed? Put bluntly: useful categories group similar things together and differentiate unlike things. What is the balance between providing detailed relationship data and diffusing our relationship types into analytical uselessness (not to mention forcing our users to scroll through an endless list of relationship types until they simply throw their hands up in frustration)?

Two of the requested relationship types from our Networking Women event crystallize the issue: lady-in-waiting to and midwife of. The first we quickly determined could be expanded to encompass all attendant of type relationships - thus preventing us from having to create a sprawling list of separate relationship types for maids of honor, ladies in waiting, gentlemen ushers, grooms of the stool(!), and so forth. While not completely identical, these relationships are all similar enough that we felt they could be aggregated.

The second was trickier. Early modern lives included many relationships of trade and exchange, and in theory, all commercial relationships could be included in a single relationship type. There’s no self-evident need for Six Degrees to have separate relationship types for fishmonger of, goldsmith of, and laundress of - though obviously someone building a different network specifically dedicated to studying early modern trade would need such granularity. So why break off midwife of and give this relationship its own separate type? We reasoned that the relationship between a woman and her midwife, because of the intimacy and rituals of the childbirth experience, was categorically different in nature from other trade and other commercial relationships. But this relationship continues to remind us that there are no bright lines, no places where we can simply (to quote Socrates in the Phaedrus) cut nature at its joints.

Sometimes, when users request additional relationship types, what they’re really wishing for is something we understand as groups. Six Degrees distinguishes between groups and relations because they have clearly different properties. Each of us can participate in a group with someone whom we’ve never met or associated with directly, even someone who was dead long before our births. Likewise, if we associate with someone, that doesn’t necessitate that we’re both members of some group. In the interest of keeping such distinctions intact, we very much encourage users to create groups where a social relationship might overstate the case. Some good candidates for groups might be Elizabeth I’s maids of honor, all the grooms of the stool in the period, or all graduates of St. Paul’s School.

Literary vs. Social Relationships

Because so much of our historical evidence - at least for the early modern period - is literary in form, it is unsurprising that most network projects focus on the textual traces of relationships. Mapping the Republic of Letters, Cultures of Knowledge, and Circulation of Knowledge are all reconstructing epistolary networks, where correspondence both provides the evidence of a relationship and forms the relationship between two people. In such networks, a relationship type like “wrote a poem about” may well be appropriate.

However, Six Degrees has a different ambition - it is fundamentally a social network. Our minimum standard for claiming a relationship exists is that two people have met one another, even if that meeting occurs through the medium of, say, a manuscript letter rather than face to face. Written documents can thus provide evidence of that relationship, but in and of themselves do not constitute relationships in our network. Knowing that Person A dedicated a book to or wrote a poem about Person B can be taken as evidence that Person A admires Person B - or that they are enemies of each other, since not all poems are flattering. However, it does not necessarily mean that they had a social relationship. Scholars of the period know well that dedications and laudatory poems were often gambits to begin relationships with potential patrons rather than dispositive evidence for a prior relationship. We can propose the following test for whether a relationship merits inclusion in our social network: a relationship type is a social relationship only if it names the manner in which two people were associates. Subjects of an author’s poems, then, may well be characterized as a group--groups, again, are an excellent way to aggregate based on shared biographical attributes--but one would have to assess the poems (and other evidence) to determine whether the author had a relationship with his or her subject.

How and whether books provide evidence of social relationships will continue to be an issue that we must grapple with, as we begin to ingest new datasets from sources such as the English Short Title Catalogue and Early English Books Online. Is there, for example, a definite social relationship between writers and their publishers? Or only between writers and printers, and between printers and publishers? Or, given how often books were published after their authors’ deaths, can we assume any personal connection at all between the names printed on a book’s title page? There will be no escaping the need for old-fashioned research, but the point is that we’ll need to make some decisions at the outset about general patterns and likeliest scenarios.

For more of our initial thinking on the Six Degrees relationship ontology, see this earlier blog post by Dan Shore and this Cultures of Knowledge podcast by Chris Warren.

#networkingwomen #ontologies


6dfb · 9 years ago

Text

Gender Inclusivity in Six Degrees

Scott Weingart and Jessica Otis

Computational methods are great at bringing voice to the historically marginalized (see Michelle Moravec or Elaine Parsons). We may never learn much about specific actors who produced few written records compared to their affluent white male counterparts, but by collecting the underrepresented together, we can hear in aggregate what’s often too quiet to discern individually.

This reconstruction is never easy, and rarely sensitive. Categorizing people dehumanizes the humanities, and when we increase the volume, we lose the nuance. We also constantly battle what computer scientists call GIGO (Garbage In, Garbage Out); analysis is only as good as the underlying data. At Six Degrees, we want our network to represent and reinforce the full diversity of early modern social ties, but a combination of historical scarcity and editorial decisions in our sources prevents the network from living up to this potential. In many ways, this mirrors the period itself: more open and egalitarian in ideal than reality. We can do something about that, but first we have to notice what’s missing.

Gender is a good place to start. For example, biographies of women represent only 5.4% of early modern entries in the Oxford Dictionary of National Biography (ODNB), which is itself the source of Six Degrees’ initial dataset. Our algorithm mining the ODNB for historical names then biases towards men even further, because women are often named in relation to the men around them, preventing our system from realizing they’re worth documenting. Several layers of bias against women (evidentiary, editorial, and algorithmic) add up, and this blog post describes how the scales are tipped before we begin balancing them this January. We’ll write a follow-up post after our Networking Women Add-a-Thon describing how it goes.

As most people in early modern Britain bore one of approximately one hundred highly gendered given names, it was possible to assign genders to most of our dataset according to their given names. A John (2000+ people), Thomas (1200+), or William (1200+) was male, while an Elizabeth (100+), Mary (100+), or Anne (75+) was female. Manual examination of original biographies helped identify the gender of the remaining people in our dataset, including the multi-named Christian Davies—also known as Catherine, Christopher, and Richard—who defied binary gender categorizations and helped motivate our tripartite division of genders. As we only know of one such individual in the dataset at present, our current analysis will work with male and female genders.
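The name-based assignment described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the two name sets hold just the six examples mentioned in this post (a real list would run to roughly one hundred names), and the function name guess_gender is our own invention.

```python
# Hypothetical lookup tables holding only the given names cited in this post.
MALE_NAMES = {"John", "Thomas", "William"}
FEMALE_NAMES = {"Elizabeth", "Mary", "Anne"}


def guess_gender(full_name):
    """Return 'male', 'female', or None when the given name is not decisive."""
    parts = full_name.split()
    if not parts:
        return None
    given = parts[0]
    if given in MALE_NAMES:
        return "male"
    if given in FEMALE_NAMES:
        return "female"
    # Not in either list: leave it for manual examination of the biography.
    return None


people = ["John Owen", "Elizabeth I", "Aphra Behn", "Christian Davies"]
needs_review = [p for p in people if guess_gender(p) is None]
print(needs_review)
```

Anyone the lookup cannot place ends up in needs_review, which stands in for the manual examination of biographies.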

So where does that leave Six Degrees? Thankfully, even given the algorithmic bias against women’s names, we managed to get just a smidge closer to gender parity than the ODNB, in which 534 of 9,929 early modern biographies (5.4%) are of women’s lives. By contrast, Six Degrees currently includes 13,443 names, 886 of which are women’s, so women make up 6.6% of the Six Degrees network. Not great, but at least closer.

Of course, Six Degrees, unlike the ODNB, is instantiated primarily as a network. It is a reconstruction of early modern relationships combining crowdsourced information and data from mining the ODNB. If we’re worried about gender diversity, we want to ensure not simply that the counts of names reach parity, but that the richness of connections between individuals of all genders is well represented. It gets historiographically complex here, because while there were roughly equal numbers of men and women in early modern Britain, we can’t be certain how gender norms affected the making of social ties, and we must take care that an effort towards data equality doesn’t cover up the harsher realities of the past.

In this case, we can construct a sort of pseudo-Bechdel test to see between whom social ties exist in our dataset. As of December 2015, Six Degrees connected 13,443 names via 170,819 ties. Of those ties, 15,909 of them connected a woman to a man, and 1,052 of them connected two women. Thus 9.9% of ties involved a woman in some capacity, and 0.6% of ties connected two women. Although 0.6% sounds low, it’s still slightly more connections than is probable given the distribution of men and women in the dataset. If we were working with a complete network—in which everyone knew everyone else—only 0.4% of those ties would be between women, yet woman-woman connections actually comprise 0.6% of the total number of ties in Six Degrees.
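For readers who want to check the arithmetic, here is a short Python sketch that recomputes the percentages from the counts quoted above. The variable names are ours. The "complete network" baseline is taken to be the number of possible woman-woman pairs divided by the number of possible pairs overall, which is our reading of the comparison rather than a documented formula.

```python
from math import comb

# Counts quoted in this post (Six Degrees as of December 2015).
names, women = 13_443, 886
ties, woman_man, woman_woman = 170_819, 15_909, 1_052

share_women = women / names                       # women among all names
share_any_woman = (woman_man + woman_woman) / ties  # ties involving a woman
share_two_women = woman_woman / ties              # ties between two women

# In a complete network every pair of people is tied, so the expected share
# of woman-woman ties is (pairs of women) / (pairs of people).
expected_two_women = comb(women, 2) / comb(names, 2)

print(f"{share_women:.1%}")         # 6.6%
print(f"{share_any_woman:.1%}")     # 9.9%
print(f"{share_two_women:.1%}")     # 0.6%
print(f"{expected_two_women:.1%}")  # 0.4%
```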

The next obvious question is, how many ties is any given individual likely to have to others within the dataset? Broken down by gender, men connect on average to 25.7 other people (median 15 other people), and women connect on average to 20.3 other people (median 15 other people). While the averages show a noteworthy bias towards men, only a few incredibly well-connected male figures skew the numbers so drastically. The equivalent medians do a better job representing the majority of individuals: men and women in Six Degrees tend to connect to the same number of people.
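To see how a handful of very well-connected people can pull a mean upward while leaving the median untouched, consider this toy Python example. The two lists are invented for illustration, chosen only so that their means and medians echo the figures above. They are not drawn from the Six Degrees data.

```python
from statistics import mean, median

# Invented tie counts: identical except for one very well-connected person.
toy_men = [10, 12, 15, 15, 18, 20, 90]
toy_women = [10, 12, 15, 15, 18, 20, 52]

for label, degrees in (("men", toy_men), ("women", toy_women)):
    print(label, round(mean(degrees), 1), median(degrees))
```

The single outlier moves the mean by several ties, but the middle value of each list stays at 15.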

The same effect is seen when looking at the structural roles men and women play in the network. One measure of structural centrality in a network is eigenvector centrality, a number assigned to an individual representing their place in the global network. High values indicate very central figures in the overall network, and low values indicate an individual’s place on the network’s periphery. Men on average have an eigenvector centrality of 0.020 (median 0.013), and women have an average eigenvector centrality of 0.017 (median 0.013). Again we see a few very central early modern men skewing their gender’s numbers higher, but in general men and women show little difference when it comes to their structural centrality in the network. That’s an encouraging result. We also didn’t notice any funny business in the distribution of centralities between genders—that is, women comprise 6.6% of the most central figures, of the least central figures, and so forth. This is perfectly in keeping with the overall 6.6% representation of women.
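For the curious, eigenvector centrality can be approximated with a simple power iteration. The sketch below is a minimal pure-Python version run on a made-up five-person network, with hypothetical nodes "A" through "E". It is not the code or the software we used for the analysis, and its scores are scaled differently from the values reported above.

```python
def eigenvector_centrality(adjacency, iterations=100):
    """Power iteration on an undirected graph given as {node: set_of_neighbours}."""
    scores = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # New score = own old score plus the scores of all neighbours.
        # Keeping the node's own score avoids oscillation on tree-like graphs.
        new = {
            node: scores[node] + sum(scores[nb] for nb in neighbours)
            for node, neighbours in adjacency.items()
        }
        # Rescale so the scores stay at unit length.
        norm = sum(value * value for value in new.values()) ** 0.5
        scores = {node: value / norm for node, value in new.items()}
    return scores


# Hypothetical toy network: A is a hub, and E hangs off the edge via D.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}

scores = eigenvector_centrality(toy_network)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

The hub A comes out as most central and the peripheral E as least, which is the intuition behind the measure: you are central if you are tied to people who are themselves central.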

All relationships in Six Degrees have numerical certainties, or confidence estimates, and we were a bit concerned whether the distribution of tie certainties reflected hidden gender biases. Our statistical method ascribes certainty to a connection based on its text mining of the ODNB; basically if two people are mentioned together a lot and in many contexts, we can be reasonably certain they were connected, and that certainty decreases as those co-mentions become sparser. Or at least, that’s the broad strokes of our operating assumption. If a historian manually adds a connection, though, it’s generally a connection they know existed, so the certainty is often 100%. We were worried most of the woman-to-woman connections would be very uncertain, given how infrequently they are mentioned, but the opposite proved true (see below).
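The raw ingredient of that assumption, how often two names turn up together, is easy to illustrate. The Python sketch below simply counts co-mentions across three invented "documents", each reduced to the set of names it mentions. It deliberately stops short of turning counts into certainties, because that step belongs to our statistical model, which this toy example does not attempt to reproduce.

```python
from collections import Counter
from itertools import combinations

# Invented documents, each reduced to the set of names it mentions.
documents = [
    {"John Owen", "John Fell"},
    {"John Owen", "John Fell", "Aphra Behn"},
    {"Aphra Behn", "Thomas Brown"},
]

co_mentions = Counter()
for mentioned in documents:
    # Sort so each pair is always stored in the same order.
    for pair in combinations(sorted(mentioned), 2):
        co_mentions[pair] += 1

ranked = co_mentions.most_common()
for pair, count in ranked:
    print(pair, count)
```

Pairs that are mentioned together more often rise to the top of the list, and pairs with a single co-mention are the ones a model would treat as least certain.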

We were relieved to find that the connection certainty distribution is the same for connections between men, and connections between women and men. Connections solely between women, because they numbered so few, show an erratic but generally matching pattern. The exceptions are at the low end, where women share fewer very uncertain connections, and the high end, where they share proportionally more 100% certain connections. That manually added ties make up a larger proportion of woman-woman connections explains the high proportion of 100% certainty connections, but we do not yet have an explanation for the low proportion of very uncertain connections.

Lastly, because everyone loves vaguely-shaped blobs titled “networks”, and because sometimes it’s nice to see the names of people we’re actually talking about, below are a few network visualizations which show the entirety of Six Degrees with and without men, names sized by their eigenvector centrality, to give you a sense of how many more men are in the network, and the names of the most central men and women. The network is ordered roughly by time.

Everyone:

Just women:

Comparing network of men and women:

Readers who are interested in further exploring early modern networks and gender should join us for our Networking Women Add-a-thon on January 23rd, 2016. Two in-person sessions will be held at Carnegie Mellon University, in Pittsburgh, PA, and the Folger Shakespeare Library, in Washington, D.C. People can also participate remotely, joining in on the conversation through Twitter (#NetworkingWomen) and our Slack channel. For further details, check out our Networking Women website.

#networkingwomen #gender


6dfb · 10 years ago

Photo

Six Degrees of Francis Bacon will be hosting an add-a-thon in January 2016. People near Pittsburgh and Washington, D.C. are encouraged to come in person. You can also participate remotely through our Slack channel - see the Call for Participants for details.

#networkingwomen #gender


6dfb · 10 years ago

Text

A Newbie’s Guide to Six Degrees of Francis Bacon

The following post was written by Alyson Goldsmith, an MA student in Carnegie Mellon University’s Literary and Cultural Studies program.

One of the things the Six Degrees team is hoping to do is expand the data relating to women. When the team invited me to share my initial experiences as a newcomer to the site, I began my hunt by trying to expand the network around Aphra Behn (1640?–1689), one of the first female professional writers, and – even more scandalously – a possible female spy. My first goal was simply to map her relationships based on her entry in the Oxford Dictionary of National Biography (ODNB), the source used for the massive datamining that is the backbone of Six Degrees. Later, I hope to research further connections that link her to the social group of the infamous Earl of Rochester and his literary circle.

I began by pulling up the two webpages simultaneously. On one screen, I had the ODNB entry for Aphra Behn. On another, I located her already known 2nd degree relationships in Six Degrees by using the lefthand tab to search a network for “Aphra Be...”, allowing the data entry box to fill in her name and birth year. When I saw the person I was looking for, I pressed the “down arrow” and “Aphra Behn (1640)” turned yellow.

I then pressed TAB to put this information in the search bar. It’s worth emphasizing that pressing TAB is a critical step. The site needs you to select a name, not just enter one.

For this initial search, I kept the confidence level of these relationships at the automatic 60% and hit “find.” The resulting data shows Aphra Behn’s node hemmed in by a white circle.

To see just her 1st degree relationships, I single-clicked on her node.

Using the scroll function on my mouse, I zoomed in, which exposed the names on the other nodes. I could have used a laptop trackpad to zoom in. There are also zoom in and out buttons located to the righthand side of the mapping window.

Some of the people mentioned in the ODNB article were already in Six Degrees, but either had no plotted relationship with Aphra Behn or had relationships only at low confidence levels. Others did not yet exist in Six Degrees at all. Six Degrees is very collaborative, and it is easy to suggest a new person to add to the system. First, you want to make very sure that they don’t already exist in the database. My status as a “curator” means that new people I create for the system can immediately become active, and in my quest to find friends for Aphra Behn, I must admit that I mistakenly created a second Colonel Thomas Colepeper, due to a spelling error on my part. You can search to see if an entry already exists by doing the same search that I did initially for Aphra Behn. (If the person went by multiple names, it is a good idea to check all of them.) If the person is not yet in the database, as was the case for Behn’s publisher Thomas Brown, the data entry box will read “no existing match.” You can test this out by entering the name of someone obviously not in the system.

Now you’re ready to add a new person! For this, I go down to the bottom of the lefthand dropdown menu to the tab that says “contribute” and click on “person.” You will be asked to sign into your Six Degrees account, if you have not already done so. Only the information with asterisks is required, so only fill in information that you know for certain and can support through the ODNB or another scholarly resource. If you add a title (for example, “Colonel” in the case of Col. Thomas Colepeper), the database will include all permutations of the full name for ease of searching later. Do not feel you have to include “Earl of Rochester,” “2nd Earl of Rochester,” etc. The space “historical significance” is a legacy from the ODNB and is used to describe what that person did, usually for a living. I think of it as a quick caption. For Aphra Behn, I would say something like “writer, playwright, poet.” In “justification for person creation,” include a link to or citation of the source you used, usually the ODNB page. As a beginning user, you would then submit this entry for approval by curators. If you are a curator, you will check the box to “approve” the entry, and double-check the info before submission. If you keep the “is active” box checked, the entry will be added to the matrix instantly, so that you can build more relationships.

A box will appear on the righthand side, asking if you would like to add relationships. You can build from here, but I find it is easier to search to see if a relationship already exists, even at a low confidence level, just like I did before adding Thomas Brown. For example, Six Degrees might initially suppose that a relationship exists at 13% confidence between Aphra Behn and Francis Bacon (a made-up example), but the relationship doesn’t appear in an initial search because the confidence threshold is set too high, or there are too many data points in question. To perform this search, go to the top of the screen and click on the dropdown that says “View Records.” Select “Relationships.” Now you can search for a relationship between any two specific people. If it does not exist, you can either click on “create relationship” above the search boxes, or return to the contribute screen and select “relationship.” If you know that two people met, you enter the certainty at 100%. For now, just worry about the fact that they met. (Worry about other types of relationships later.)
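
To see why a low-confidence relationship can exist in the database and still be missing from the network view, it may help to think of the confidence setting as a simple filter. The sketch below is only an illustration of that idea, not the site's actual code: the function name, the record layout, and the choice of "at or above" the threshold are all assumptions, and the Behn–Bacon pair is the same made-up example as above.

```python
def visible_relationships(relationships, threshold=60):
    """Keep only relationships at or above the confidence threshold (in percent)."""
    return [r for r in relationships if r["confidence"] >= threshold]


# Made-up records for illustration only.
records = [
    {"pair": ("Aphra Behn", "Francis Bacon"), "confidence": 13},
    {"pair": ("Person A", "Person B"), "confidence": 75},
]

shown_default = visible_relationships(records)            # the 60% default
shown_low = visible_relationships(records, threshold=10)  # a lowered threshold

print([r["pair"] for r in shown_default])
print([r["pair"] for r in shown_low])
```

At the default setting only the 75% record survives the filter, while lowering the threshold brings the 13% record back into view. That is why searching the relationship records directly is a safer check than relying on the network view.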

If you know or can narrow down when the relationship began or ended, include those dates, but you don’t have to. If left blank, Six Degrees will fill in that information using the birth/death dates of the individuals in question. Once you have established that these two people met, you can create new relationship “types” to indicate relatives, coworkers, friends, etc.
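
One plausible way to picture the automatic date fill is as the overlap of the two lifespans: a relationship cannot start before both people are born or continue after either has died. The sketch below works under that assumption. The post does not say exactly how Six Degrees computes the dates, so the function name and the rule itself are hypothetical, as are the second person's dates.

```python
def default_span(person_a, person_b, start=None, end=None):
    """Return (start, end) years for a relationship.

    Each person is a (birth_year, death_year) pair. Any date the contributor
    leaves blank falls back to the overlap of the two lifespans. This rule is
    an assumption for illustration, not the documented Six Degrees behavior.
    """
    if start is None:
        start = max(person_a[0], person_b[0])  # the later birth
    if end is None:
        end = min(person_a[1], person_b[1])    # the earlier death
    return start, end


behn = (1640, 1689)       # Aphra Behn's dates as given in this post
other = (1650, 1700)      # hypothetical second person

blank_dates = default_span(behn, other)
known_start = default_span(behn, other, start=1670)
print(blank_dates, known_start)
```

Any date you do supply simply overrides the fallback, which matches the advice above: add dates when you know them, and leave the rest blank.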

One thing I initially struggled with in contributing to Six Degrees was the worry that I would add extraneous data that later editors would have to clean up. As a humanistic project, Six Degrees tries to allow contributors to add as many different types of information about an individual or relationship as possible, which allows people to get the most out of the data visualization. Don’t feel pressured to fill in all the data entry boxes. It’s better to stick to only the things you know for sure. In some cases, though, more data can be better. Let’s say, for example, that Six Degrees estimates a low confidence relationship between Aphra Behn and the theater manager Thomas Betterton. If I happen to know that they met for sure, I can (and should) create a new “met” relationship at 100% confidence. Over time, more contributors will post their own confidence levels for this relationship, and the more people who get involved, the more accurate the picture of the early modern social network becomes.


6dfb · 10 years ago

Text

Announcing Six Degrees of Francis Bacon

We are pleased to announce the beta release of Six Degrees of Francis Bacon, a navigable social network of early modern Britain.

The network includes thirteen thousand early modern persons living between 1500 and 1700 whose social ties we've inferred statistically by data mining entries from the Oxford Dictionary of National Biography.

But the inferred network is just a start.

Six Degrees is built to be an evolving record of scholarly knowledge about how people were connected to each other in early modern Britain. To that end, scholars and students can easily contribute their local expertise to the global network. We know that many of you already have detailed knowledge of particular individuals’ friends, teachers, lovers, and enemies, though until now, there have been few venues to communicate such knowledge. With your help, we aim to build an expert-driven social map of early modern Britain, gathering and incorporating early modern relations for the benefit of scholars, students, and the public at large.

We’ve been working on the site for the past few years with a host of terrific programmers and statisticians, and, for the last year, with historian Jessica Otis, a post-doctoral fellow supported by the Council on Library and Information Resources and the Digital Library Federation. Carnegie Mellon Digital Humanities Specialist Scott Weingart also recently joined the team. The full list of contributors would be incredibly long, but we certainly couldn’t have made the site what it is without key contributions from Lawrence Wang, Mike Finegold, Cosma Shalizi, Katarina Shaw, Raja Sooriamurthi, Chanamon Ratanalert, Rebecca Smith, Sama Kanbour, Angela Qiu, Ivy Chung, Miko Bautista, Amiti Uttarwar, TJ Olojede, Alexandra George, Emmett Eldred, Sarah Hodgson, Jordon Cox, Ruth Ahnert, and Sebastian Ahnert.

Though Six Degrees is still under development, it is fully functional. Not only can you navigate the early modern social network by clicking and double-clicking your way through the site, you can also sign in here to contribute your own knowledge to the network.

We know that finding one's way in a new website can be challenging at first, so we've posted a couple tutorials here and below: http://sixdegreesoffrancisbacon.com/tutorial

We're also hoping that professors will introduce the site to their students and potentially guide them through the process of curating the relations of a person or two. Jessica Otis has developed some teaching materials on how to integrate SDFB into the classroom, which are available here: http://sixdegreesoffrancisbacon.com/guide

If you have any questions about the site or any feedback to share, don't hesitate to contact us about it directly, whether by email to one of us or via this form: http://goo.gl/forms/J4u7kgI6h4.

Thanks so much for your time. We hope you find Six Degrees of Francis Bacon as useful and exciting as we do.

Sincerely,

Chris Warren and Daniel Shore, Co-Founders

