Netlytic provides three types of accounts. Tier 1 is ideal for exploring Netlytic’s capabilities; Tier 2 is great for small class projects; Tier 3 is ideal for large research projects. Please see our account types page for further details on account features.
If you are new to Netlytic, Tier 1 is great for exploring text and network analysis features, while Tier 2 is an effective option for small projects and class assignments. Both of these tiers are free of charge. Tier 3 is a community-supported option for those who need to collect numerous large datasets.
Netlytic is committed to maintaining free access for Tier 1 & 2 accounts. However, collecting and analyzing millions of data points from social media requires a lot of computing power. We rely on a community-supported approach to assist with infrastructure costs and ensure Netlytic runs smoothly and securely through a commercial web hosting company. If you like Netlytic, please support the hosting of our project by upgrading to “Tier 3”. Below is more information about each tier.
Note: For larger datasets (>100,000 records per dataset), a dedicated solution might be required. Please contact us at email@example.com
| ||Tier 1||Tier 2||Tier 3|
| ||This is the default tier||Request a free upgrade by logging in to your account and clicking on the "My Account" page|| |
|Max # of Datasets||3||5||300|
|Max # of Records/Dataset||2,500||10,000||100,000|
| ||Great for exploring what Netlytic can do!||Great for smaller projects and class assignments!||Great for larger research projects and brand management tasks!|
You can automatically upgrade from a Tier 1 to a Tier 2 account free of charge by logging in, visiting the “My Account” tab, and clicking the “request an upgrade” link.
The limit was set based on our empirical testing of the current technology for network data visualization. With datasets larger than 100,000 records, the network structure becomes too dense to draw any meaningful conclusions. Also, the computational complexity of visualizing such large networks is very high: it would take a few hours per visualization and a lot of computing resources to complete the task. Therefore, a more balanced approach (from the empirical as well as the computational side) is to split your dataset into smaller periods of time and perform the analysis on each of the subsets separately. This approach will also enable you to draw conclusions about changes in communication networks and actors over time. Tip: You can split your dataset(s) manually based on a period of time either by using the built-in feature (look for the ‘scissors’ button under My Datasets) or by exporting/downloading your dataset, splitting it in Excel, and then re-uploading it to Netlytic.
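For those who prefer a script to Excel, the export-and-split approach can be sketched roughly as follows. This is only an illustration: it assumes an exported CSV with a `pubdate` column formatted as `YYYY-MM-DD HH:MM:SS`, and the function names, the date format, and the output file naming are assumptions rather than anything Netlytic prescribes.

```python
import csv
from collections import defaultdict
from datetime import datetime


def split_by_month(rows, date_column="pubdate", date_format="%Y-%m-%d %H:%M:%S"):
    """Group record dicts by the calendar month found in their date column.

    The column name and date format are assumptions; adjust them to match
    the file you actually exported.
    """
    subsets = defaultdict(list)
    for row in rows:
        posted = datetime.strptime(row[date_column], date_format)
        subsets[posted.strftime("%Y-%m")].append(row)
    return dict(subsets)


def write_subsets(subsets, fieldnames, prefix="subset"):
    """Write one CSV file per period, e.g. subset_2018-02.csv."""
    for period, rows in subsets.items():
        path = f"{prefix}_{period}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
```

Each resulting file could then be re-uploaded as its own dataset and analyzed separately.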
Platforms & Data Collection
Netlytic uses APIs (application programming interfaces) to collect data from each import source. In the case of Twitter and Instagram, the API requires each user (data collector) to authenticate. Authentication serves two primary purposes: first, to confirm that the collector has permission; and second, to ensure that the collector does not exceed the number of records allowed for collection, as each API has a specified limit. Netlytic takes you through this authentication process when you link a Twitter and/or Instagram account to your Netlytic account. Please note that Netlytic will never post anything to these social media accounts; it only allows users to collect from these platforms. Tip: some researchers create a separate account just for data collection purposes.
Netlytic can pull data from 7 different sources including: Twitter, Facebook, Instagram, YouTube, RSS Feeds, as well as .csv or .txt files from Dropbox and Google Drive. When importing data from social media sites, Netlytic uses an API to collect publicly available data. For instance, with Facebook data, users can only import data from open and public pages, groups, events, etc. Netlytic cannot import data from any private conversations or closed groups. Please see below for specific requirements and information per platform.
|Data source||Request Frequency||Max Records Per Request||Account Linking Required?||Note|
|Twitter||every 15 minutes||up to 1000||Twitter requires you to link your Twitter account to use this importer.||This importer uses the Twitter REST API v1.1 search/tweets endpoint, which returns a collection of relevant Tweets matching a specified query. Please note that Twitter's search service and, by extension, the Search API are not meant to be an exhaustive source of Tweets. Not all Tweets will be indexed or made available via the search interface. The Twitter API rate limit allows about 10 active collectors per user. Typically, tweets older than a week will not be returned.|
|Facebook||hourly||up to 2500||Facebook does not require you to link your Facebook account to use this importer.||This importer uses the Facebook Graph API v2.2, which returns posts and replies from public Facebook groups, pages, events, or profiles. It returns up to 100 top-level posts to/from a page, as well as up to 25 replies per post. Replies to replies are not included.|
|Instagram||hourly||up to 10000||Instagram requires you to link your Instagram account to use this importer.||This importer uses one of the following Instagram API v1 endpoints, depending on the query type: (1) Get a list of recently tagged media. Note that this media is ordered by when the media was tagged with this tag, rather than the order in which it was posted. (2) Search for media in a given area (5km radius). The time span is set to 5 days. Please note: Instagram may return publicly shared photos from users with otherwise private profiles. See Documentation.|
|YouTube||once|| ||YouTube does not require you to link your YouTube account to use this importer.||This importer uses the YouTube Data video comments feed API v2.0.|
|RSS||daily|| ||Not required||This option allows you to import records using Really Simple Syndication (RSS) feeds.|
|Text file / Google Drive||once|| ||Not required||This option allows you to import messages from a text or CSV file by uploading files to Netlytic.
Note: If your dataset includes more than one text file, you will need to upload and import one file at a time.
1) CSV file (delimiter = a comma; enclosure = a double quotation mark; escape = a backslash). The first line should include the column names.
2) Full-text transcript with the headers:
Date: Sun, 1 Apr 2007 14:10:17 -0400
Subject: Origin of the term "Internet" ?
I would prefer to not have to do it, but each time I try to submit a course paper without it capitalized, I get the paper back marked up by the professors, telling me it is capital I- internet.
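As an illustration of the CSV settings listed in the table above (comma delimiter, double-quote enclosure, backslash escape, column names on the first line), here is a minimal Python sketch that writes a file in that shape. The function name and the column names `date`, `author`, and `text` are hypothetical placeholders; the Text File import page remains the reference for the headers Netlytic actually expects.

```python
import csv
import io


def write_import_csv(handle, column_names, records):
    """Write rows with a comma delimiter, double-quote enclosure and
    backslash escape, with the column names on the first line."""
    writer = csv.writer(
        handle,
        delimiter=",",
        quotechar='"',
        escapechar="\\",
        doublequote=False,      # escape embedded quotes with a backslash
        quoting=csv.QUOTE_ALL,  # enclose every field in double quotes
        lineterminator="\n",
    )
    writer.writerow(column_names)
    writer.writerows(records)


# Hypothetical column names, used here for illustration only.
buffer = io.StringIO()
write_import_csv(
    buffer,
    ["date", "author", "text"],
    [["2007-04-01 14:10:17", "alice", 'Origin of the term "Internet" ?']],
)
```

Setting `doublequote=False` together with `escapechar` makes the writer emit `\"` for embedded quotation marks instead of doubling them, which matches the backslash escape named in the table.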
At this time, Netlytic can only create Instagram queries based on a single search term (e.g. #picoftheday) or coordinates (e.g. 43.718241,-79.378058). For Twitter, you can use the "FROM" search operator to collect tweets from a particular user. For example, from:realdonaldtrump
Netlytic users can only import one search term per query when importing directly from Instagram. As an alternative, you could run several individual search queries, download the results and amalgamate them into a single CSV, clean it and eliminate duplicates, and then re-upload it to Netlytic for further analysis.
Yes, you can create a subset while the data is still being collected as well as once the collection period has ended. To do this, create a subset from your dataset home screen by clicking on the date stamp or the scissors icon. Additionally, if your dataset has over 10,000 records, you can access the scissors icon under the dataset’s text analysis tab. A new window will open with a calendar, where you select the dates for the new subset.
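The amalgamation step can be sketched in Python along these lines. It is a rough illustration, not a Netlytic feature: the function name is hypothetical, and it assumes that every downloaded file has a `link` column whose value uniquely identifies a post.

```python
import csv


def merge_csv_files(paths, key_column="link"):
    """Combine rows from several CSV exports, keeping only the first row
    seen for each value of key_column (assumed to identify a post)."""
    seen = set()
    merged = []
    fieldnames = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # Collect column names in the order they first appear.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                key = row.get(key_column)
                if key and key in seen:
                    continue  # duplicate post, already kept once
                if key:
                    seen.add(key)
                merged.append(row)
    return fieldnames, merged
```

The returned `fieldnames` and `merged` rows can be written back out with `csv.DictWriter` before re-uploading. Rows without a value in the key column are kept as they are, since there is nothing to compare them on.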
Netlytic will collect the link of pictures posted with an Instagram message, and these links will be available through the csv file. You will also be able to preview any images that accompany messages in the Network Analysis visualization when you explore individual nodes.
Since the APIs are not case sensitive, you should get the same results whether your search query uses capital or lower-case letters.
Netlytic only collects publicly available data that is made available through the platform’s public APIs. If someone mentions a “private” account in their message, the account “name” will appear in the network visualization because of the mention by another user (name network). The only exception known to us is the Instagram API, which may return publicly shared photos/comments from users with otherwise private profiles. Here is what Instagram states on their API documentation page: “If someone with a private profile shares a photo or video to a social network (like Twitter, Facebook, Foursquare and so on) using Instagram, the image will be visible on that network and the permalink will be active. In other words, the photo will be publicly accessible by anyone who has access to its direct link/URL.” PLEASE NOTE: It is the responsibility of every researcher to determine an appropriate level of data anonymization and data abstraction when reporting/presenting their results to the public.
Twitter limits the number of live collections you can set up in Netlytic. It varies based on the specific queries you plan to run, but on average, you should be able to run up to 15 simultaneous collections.
Once you download your dataset from Netlytic, the “pubdate” column shows the date and time when the message was posted, expressed in the Atlantic time zone. You can confirm this by going to the URL listed in the "link" column to view the post directly on the web and comparing the date/time shown online in your own time zone with the value in the spreadsheet that you downloaded from Netlytic. For example, the following tweet is shown in Netlytic as posted on Feb 12, 2018 at 2:19pm. When you view this tweet on Twitter, you can confirm when it was posted in your local time zone. If you are in the Eastern time zone, you will see it posted at “1:19pm”, which is an hour behind the Atlantic time zone shown in the spreadsheet. You can use the following online tool to help with the conversion between time zones: https://www.timeanddate.com/worldclock/converter-classic.html Finally, whenever available (when provided by Twitter), Netlytic records the poster’s local time zone in the “postertimezone” column. The value in this column shows the UTC offset in hours. For example, the poster of the above-mentioned tweet is located in North Carolina, USA, which is in the Eastern time zone and is currently not observing daylight saving time, so the UTC offset for this state is "-5" hours: https://www.timeanddate.com/time/zone/usa/north-carolina
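The same conversion can be done in a script. The sketch below is a minimal illustration that makes several assumptions: “pubdate” is formatted as `YYYY-MM-DD HH:MM:SS`, the Atlantic time zone is at its standard offset of UTC-4 (it would be UTC-3 during daylight saving time), and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Atlantic Standard Time (UTC-4). During daylight saving time
# the Atlantic offset is UTC-3, so pass a different source_tz in that case.
ATLANTIC_STANDARD = timezone(timedelta(hours=-4))


def to_poster_local_time(pubdate, utc_offset_hours,
                         source_tz=ATLANTIC_STANDARD,
                         date_format="%Y-%m-%d %H:%M:%S"):
    """Convert a pubdate string into the time zone given by a UTC offset
    in hours, such as the value found in the postertimezone column."""
    posted = datetime.strptime(pubdate, date_format).replace(tzinfo=source_tz)
    return posted.astimezone(timezone(timedelta(hours=utc_offset_hours)))


# The example from the text: 2:19pm Atlantic on Feb 12, 2018, UTC offset -5.
local = to_poster_local_time("2018-02-12 14:19:00", -5)
print(local.strftime("%Y-%m-%d %H:%M"))  # 2018-02-12 13:19
```

Fixed offsets keep the sketch dependency-free, but they do not account for daylight saving changes; a time zone database would be needed for dates spanning a whole year.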
Visualizations and Image Exporting
Clusters are determined by Netlytic’s algorithms. The nodes (which represent individuals) in the visualization are grouped based on a common characteristic; for instance, a cluster could be based on geographic location. Each cluster is given a different colour to help users distinguish between the groups they are examining. This is especially useful with larger, denser networks.
There are two ways you can export your work in Netlytic.
- Exporting Data: You may export the dataset, or the raw data, as a csv file. To do this, navigate to your dataset home screen by logging into Netlytic. Locate the dataset you would like to export and click on the download icon. In the pop-up window, click on the csv image to begin the download.
- Exporting Images: You may also wish to export images of your text and network visualizations. For any of the text visualizations (word cloud, words over time, and categories) you will need to take a screenshot. To capture your network visualizations, begin by visualizing the network. At the bottom of the left-hand panel, click the “save image” button. The network image will now be saved in this panel. You can download this image to your computer by first clicking on its icon and then, in the next popup screen, right-clicking on the image and selecting “save as”.
You may notice with some datasets that emojis appear in the word cloud visualization; however, analyzing emojis specifically can be unreliable.
The stacked graph provides a visual representation of how popular topics within the conversation change over time. The x-axis represents a specific time period (e.g. June 1 to September 29th), while the y-axis illustrates the popular topics.
We understand that some projects require complete anonymity, for instance removing usernames from the text and network visualizations. Here are a few suggestions you may want to consider:
- You may choose to black out usernames in screenshots of visualizations or datasets.
- When using the text analysis visualizations, within the word cloud, you can remove any usernames by clicking on the red x button beside each word. This process will remove usernames in the words over time visualization.
- In the network analysis, along the left-hand side panel, you may disable the “node labels” button to remove any usernames appearing on the visualization.
As each research project is unique, it’s hard to determine an “ideal” sample size. However, you may wish to keep the following in mind when determining the appropriate sample size for your research:
- What is your research question? For example, studying changes over time requires a defined time period that is long enough to show changes; this is up to the researcher and depends on the area of study.
- Are there specified criteria outlined by the publication venue?
- What type of analysis are you conducting: qualitative or quantitative?
- The availability of data: some topics will have more data. If you are looking at a sample of all possible records, you will need to justify whether the sample is representative or random.
Once you see the same posters coming back to the “community”, this may indicate you’ve reached the saturation point. It’s also important to be aware of external factors that influence posting behaviour (e.g. marketing campaigns, particular events, etc.). Ideally, you would collect data for the whole duration of the event and, if possible, for the same length of time before and after it. This allows you to draw conclusions about whether user behaviour is influenced by these external factors.
Once the Netlytic user is authenticated, only publicly available social media posts are collected. It is up to the individual researcher to determine whether there is a need and/or expectation to inform social media users before, during or after data collection. Netlytic does not manage an informed consent process or any other form of consent, as this is outside the scope of the tool.
Gruzd, A. (2016). Netlytic: Software for Automated Text and Social Network Analysis. Available at http://Netlytic.org
Check that the headers are properly formatted. Our Text File import page outlines the headers needed to import data into Netlytic by way of either a .csv or .txt file. Formatting also requires that headers are all in lower case.
Download the CSV file from Netlytic. Open Excel and import the CSV through the “Data” -> “From Text” option. You will need to select the file origin as “Unicode (UTF-8)”. You can also export your dataset as an Excel file, which handles non-English characters, new lines, emojis and other special characters in tweets more robustly.
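If the header row of a CSV file contains capital letters, it can be normalized with a few lines of Python before importing. This is a minimal sketch: the function name and the column names in the example are hypothetical, and it only lower-cases and trims the first row without checking that the names are the ones Netlytic expects.

```python
import csv
import io


def lowercase_headers(csv_text):
    """Return the CSV text with its first (header) row lower-cased and
    stripped of surrounding whitespace; data rows are left untouched."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    rows[0] = [name.strip().lower() for name in rows[0]]
    output = io.StringIO()
    csv.writer(output, lineterminator="\n").writerows(rows)
    return output.getvalue()


# Hypothetical column names, for illustration only.
fixed = lowercase_headers("Date,Author,Text\n2018-02-12,alice,hello\n")
```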
Please note it may take up to 15 minutes for the first import to get data. If it is not there, there are a couple of possible reasons:
- First, check that the Facebook group/page that you are collecting is public (meaning that it does not require anyone to be logged in to view it).
- Second, if you are collecting posts from a Facebook group, make sure that you are using its Facebook ID instead of the group’s name/URL. You can find the group ID at http://lookup-id.com. Paste the Group's ID in the Facebook Page Code/Search field in Netlytic.
- Finally, due to some inconsistencies with the Facebook API, it may take a couple of tries to get data from Facebook. We suggest checking the box titled “Enable data collection from this Facebook search every hour for the next 1 day(s)” before you start your data collection.