Wrapping Up Senior Seminar
In Dickinson College’s Computer Science Senior Seminar, we are responsible for contributing to an open-source project of our choice. My partner, Leo, and I decided to work on Kubernetes because we wanted to challenge ourselves with a massive code base that is widely used in industry. Just as we expected, working on Kubernetes was a challenge, but it helped us grow into better developers.
Personally, working on a big project meant putting in time. This is perhaps true of everything, but what I learned was proportional to the time I invested: no more and no less. Realistically, I knew that I was not going to be able to add a new feature that would define the next Kubernetes version or fix a massive bug that had caused thousands of VMs to crash over the past 3 years. To be frank, many issues tagged “Good First Issue” overwhelmed me. In fact, I did not even know how to replicate the bug or understand why the issue was occurring in the first place.
However, being clueless was no excuse. As long as the issue was “reasonably” difficult, I could still get the gist by reading the code, looking for other issues and PRs that mentioned the issue, and studying issues and PRs that dealt with similar problems. This took time: the amount I understood was directly proportional to the time I spent trying to wrap my head around it. Eventually, I could get a grasp of what was causing the issue and form a hypothesis on how to address it. Even when I still could not replicate the issue, test cases were there to guide me.
Working on Kubernetes definitely helped improve my critical thinking, problem-solving, and collaboration skills. I learned how to navigate a large code base: it wasn’t my job to understand EVERYTHING. One valuable skill I learned was navigating with abstraction. Suppose there is a function A which calls B and C and is itself called from D. In many cases, I did not have to know the details of B and C to understand D; knowing what A does was enough. After doing this many times, I became better at predicting which functions I needed to look at and which ones I could ignore.
Leo and I also worked with another contributor who was working on the same issue. This contributor was a great help: he gave us general instructions on how to approach the issue, pointed out which files to work on, and made a pull request that we could work from. We bounced ideas back and forth. I learned that collaborating on an open-source project is all about trust: trusting that the other contributor knows how to approach the issue. If the other contributor or I do not know how, we can always communicate to build that trust and shared responsibility.
Personally, I want to be a great developer (who doesn’t?). I want to be adept and use my skills to earn a lot of money (again, who doesn’t?). I learned that open source is a great way to learn while also giving back to the community. I will continue to work on open-source projects in the future.
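To make the idea of navigating with abstraction concrete, here is a toy sketch. The function names and what they do are hypothetical and have nothing to do with Kubernetes itself: the point is only that reading `d` requires knowing what `a` promises to return, not how `b` and `c` work.

```python
def b(items):
    """Hypothetical helper: drop empty entries."""
    return [item for item in items if item]


def c(items):
    """Hypothetical helper: normalize entries to lowercase."""
    return [item.lower() for item in items]


def a(items):
    """Return cleaned, lowercase labels. Callers only need this contract."""
    return c(b(items))


def d(raw_labels):
    """Count distinct labels. Reading a's docstring is enough to follow this."""
    return len(set(a(raw_labels)))
```

For example, `d(["App", "app", "", "DB"])` can be reasoned about from `a`'s one-line description alone, without opening `b` or `c`.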
My Experience With Pirating
To be completely frank, I pirated software for as long as I can remember. Knowing that there was no way my parents were going to pay $50 for the new Pokémon game, I downloaded a cracked version of the game onto an R4 SD card, which contained 20 other pirated games. I jailbroke my iPod Touch to download more free games. I pirated Microsoft Word and PowerPoint to do my assignments in middle school. The first Mac I bought in China came installed with cracked productivity software like Final Cut Pro.
Stealing was much more evident with physical objects that exist in limited copies, regardless of the price. I never stole from the local supermarkets, even when my friends occasionally took 50-cent ice creams, because I deemed it “wrong”; yet I did not hesitate to pirate $100 worth of Adobe Photoshop and Guitar Pro 6. In an environment where all my peers torrented everything and there were no repercussions except the occasional trojan, I never stopped to question the convenience and accessibility of pirating.
I came to realize that pirating, including movies and songs, was stealing only when my ethics teacher pointed it out in my sophomore year of high school. I did not stop torrenting after learning that pirating was unethical; neither did I feel guilt. My life went on the same because fretting over what I viewed as a minor ethical issue would neither save me money nor give me convenience. In other words, it was not practical to be ethical.
Today, I don’t pirate. Not because I fully understand the ethical reasons behind it, but because there is no need to. The school provides access to all the necessary productivity applications, like Microsoft Word and Photoshop. I switched over to freemium games that do not require me to pay. When I needed Parallels to install Windows on my M1 Mac, and Guitar Pro 7, I paid because (1) I had no problem paying for satisfying software I use very often and (2) I didn’t want to go through the trouble of browsing The Pirate Bay.
Distributed Computing like reCAPTCHA and Duolingo
Tomorrow, our Computer Science Senior Seminar will be meeting Luis von Ahn, inventor of reCAPTCHA and Duolingo. reCAPTCHA is used to distinguish between humans (regular users) and automated bots through a simple test that involves deciphering text and images; Duolingo is a language learning platform where users can practice vocabulary, grammar, and pronunciation at their own pace. Behind the scenes, however, both reCAPTCHA and Duolingo are examples of crowdsourcing and distributed computing. Distributed computing is done on distributed systems, in which different computers communicate to cooperate on tasks. The millions of reCAPTCHA tests contribute to deciphering old texts and digitizing old books; Duolingo offers translation services based on the translation database built by Duolingo users. reCAPTCHA and Duolingo’s distributed computing model has proved successful: what other distributed computing applications exist out there?
Folding@home is a crowdsourced distributed computing platform created at Stanford in 2000. Folding@home aims to simulate protein dynamics, including protein folding, drug design, and other molecular dynamics. Simulating protein dynamics is extremely expensive computationally; to “lend” computational power, volunteers can offer up their computers’ resources while the computers are idle. Each volunteered machine receives a piece of a simulation, or work to be computed. Once the work is computed, the machine returns the result to Folding@home’s database server, where each volunteer’s results are compiled into an overall simulation. In 2020, more than 400,000 volunteers participated in Folding@home, using their idle computer resources to help analyze COVID-19 protein structures. As a result, Folding@home was able to perform more than 1,000,000,000,000,000,000 operations per second, which at the time exceeded the power of the fastest supercomputers.
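The receive, compute, and return cycle described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: worker threads stand in for volunteer machines, a sum of squares stands in for a real simulation, and every function name is hypothetical. It is not Folding@home’s actual protocol or code.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(total_steps, unit_size):
    """Server side: cut one big job into independent (start, end) ranges."""
    return [
        (start, min(start + unit_size, total_steps))
        for start in range(0, total_steps, unit_size)
    ]


def volunteer_compute(work_unit):
    """Volunteer side: compute one work unit (a stand-in calculation)."""
    start, end = work_unit
    return sum(i * i for i in range(start, end))


def run_project(total_steps, unit_size, volunteers=4):
    """Hand out work units, collect partial results, and combine them."""
    units = split_into_work_units(total_steps, unit_size)
    # Each thread plays the role of one idle volunteer machine.
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        partial_results = list(pool.map(volunteer_compute, units))
    return sum(partial_results)
```

Because each work unit is independent, adding more volunteers only changes how quickly the units finish, not the combined result.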
Similar to Folding@home, BOINC (Berkeley Open Infrastructure for Network Computing) is a volunteer computing project that takes advantage of volunteers' idle computer resources. Unlike Folding@home, which focuses only on protein activity, BOINC currently hosts 32 projects in mathematics, linguistics, medicine, molecular biology, climatology, etc. BOINC can also be run on Android mobile devices. It includes a credit system and validates results before compiling them, because volunteers are untrusted and may try to cheat for credits. As of 2021, BOINC has over 64,000 active volunteers and over 230,000 hosts.
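One common way to guard against untrusted volunteers is to send the same work unit to several hosts and accept a result only when enough of them agree. The sketch below assumes this redundancy scheme purely for illustration; the function names, host names, and quorum value are hypothetical, and this is not BOINC’s actual code.

```python
from collections import Counter


def validate_by_quorum(reported_results, quorum=2):
    """Return the agreed value, or None if too few hosts agree.

    reported_results maps a host name to the value that host returned.
    """
    if not reported_results:
        return None
    value, count = Counter(reported_results.values()).most_common(1)[0]
    return value if count >= quorum else None


def hosts_to_credit(reported_results, accepted_value):
    """Only hosts whose result matches the accepted value earn credit."""
    return sorted(
        host
        for host, value in reported_results.items()
        if value == accepted_value
    )
```

With reports `{"host-a": 42, "host-b": 42, "host-c": 7}` and a quorum of 2, the value 42 is accepted and only `host-a` and `host-b` are credited, so a host that returns a made-up answer gains nothing.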
Though Folding@home and BOINC crowdsource computational power much as reCAPTCHA and Duolingo crowdsource human effort, they are not exactly the same, because their users do not have to participate actively; in fact, it is more helpful for the users to leave their computers idle. By contrast, Foldit is a crowdsourcing computer game that uses humans’ pattern-recognition and puzzle-solving abilities to study protein folding. Released in 2008 by the University of Washington, Foldit has more than 240,000 registered players, and human intuition has proved useful: a 2010 paper published in Nature reported that Foldit players yielded solutions that matched or outperformed computer-generated solutions.
It is interesting to learn how distributed computing, whether users are actively participating or not, is being used for scientific advances. Just as COVID-19 increased the demand for and the number of participants in Folding@home and Foldit, we should keep an eye on how computational demands change.
Sources
https://boinc.berkeley.edu/
https://en.wikipedia.org/wiki/Berkeley_Open_Infrastructure_for_Network_Computing
https://en.wikipedia.org/wiki/Folding@home
https://foldingathome.org/about-2/?lng=en
https://www.the-scientist.com/news-opinion/crowdsourced-protein-simulation-exceeds-supercomputers-power-67423
Looking into Version Control and Version Control Services
This quarter, we were formally introduced to version control in our Computer Science seminar. We learned to navigate Git because Git has emerged as the most successful and convenient version control software. This made me wonder: how do other version control systems, especially the more popular ones, compare to Git?
According to a Medium post comparing version control software (see Sources), other popular version control systems are Mercurial, CVS, and SVN.
To start off, Git is characterized by a distributed repository model, good support for non-linear development, and performance that is independent of project size. Its pros: it is fast, efficient, convenient, and easily maintainable; it is cross-platform; and code changes are easy to detect. However, Git does not support keyword expansion, does not preserve timestamps, and its history log becomes increasingly complex.
CVS, a.k.a. Concurrent Versions System, uses a centralized repository model and was released in 1990. It keeps historical snapshots of the project and uses compression for storage. The greatest pro of CVS is that it is well suited to collaborative work. However, CVS does not support signed revisions or merge tracking and does not perform integrity checking.
SVN, or Subversion, was released in 2000 as an improvement on CVS. SVN is also centralized and supports versioned directories, cheap and efficient branching, authorization, and file locking. However, SVN is slower, and managing repositories is complicated.
Mercurial was released in 2005 with a decentralized repository model similar to Git's. Mercurial is lightweight, high-performance, scalable, and supports collaborative development. However, Mercurial’s downsides are that partial checkouts are not allowed and that it does not fare well with additional extensions.
Ideally, developers should decide which version control software to use based on the project’s needs. However, it appears that the decentralized nature of Git and Mercurial makes them much more lightweight and easier to use than the centralized CVS and SVN. Local branching in Git is an especially cheap operation that allows for intuitive, efficient, fast, and collaborative version control. The largest cons of Git and Mercurial, that merging can be complicated and that large projects become complex to manage, apply to CVS and SVN as well.
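Why local branching is cheap can be shown with a toy model: a branch is just a name that points at a commit, so creating one copies a single reference rather than any files. The `ToyRepo` class below is a hypothetical simplification written for this post, not Git’s real data structures or API.

```python
class ToyRepo:
    """A tiny model of commits and branches (illustration only)."""

    def __init__(self):
        self.commits = {}                # commit id -> (parent id, message)
        self.branches = {"main": None}   # branch name -> commit id
        self.head = "main"               # currently checked-out branch

    def commit(self, message):
        """Record a new commit and move the current branch to it."""
        parent = self.branches[self.head]
        commit_id = f"c{len(self.commits) + 1}"
        self.commits[commit_id] = (parent, message)
        self.branches[self.head] = commit_id
        return commit_id

    def branch(self, name):
        """Create a branch: only one pointer is copied, no commit data."""
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        """Switch the current branch."""
        self.head = name
```

In this model, creating a `feature` branch after the first commit and committing on it moves only the `feature` pointer; `main` still points at the first commit, and no history is duplicated.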
Git emerges as superior (in most respects). To enhance the user experience, Git is typically used along with hosting and management services like GitHub. GitHub provides functionality beyond standard Git and helps teams manage software better. In addition to GitHub, enterprises have also developed their own services for managing version control. Microsoft, even though it has owned GitHub since 2018, developed Azure DevOps, which is geared towards closed enterprise software. Beyond standard version control, Azure DevOps offers more of the continuous integration/continuous delivery (CI/CD) features common in industry, including Kubernetes integration, Jenkins server support, IDE integration, cloud build, cloud load testing, performance tracking, and machine-learning-based detection and diagnosis. Azure DevOps’s lifecycle management tools make it a great choice for well-developed proprietary software, which contrasts with GitHub’s emphasis on open source development. AWS CodeCommit is Amazon’s version/source control service, which offers security, scalability, and ease of use in addition to the basic functionality of Git. Like Azure DevOps, AWS CodeCommit is geared towards company-owned software. Regardless, Azure DevOps’s and AWS CodeCommit’s integration with Git demonstrates how Git has emerged as the most powerful and prevalent version control software.
Sources: https://medium.com/@derya.cortuk/version-control-software-comparison-git-mercurial-cvs-svn-21b2a71226e4
https://aws.amazon.com/codecommit/
https://www.g2.com/categories/version-control-systems
https://www.upguard.com/blog/microsoft-visual-studio-team-services-vs-github
H/FOSS Communities in Korea
Though software is universal, it is still mostly written, documented, and distributed in English. H/FOSS (Humanitarian/Free Open Source Software) is no exception: the most popular open source repositories on GitHub (those with the largest code bases and the most contributors) are projects supported or initiated by US corporations, like VS Code (Microsoft), Flutter (Google), TensorFlow (Google), React-Native (Facebook), and Kubernetes (Google). In contrast to these enormous projects, I wonder what kinds of open source projects are initiated outside the US. How are they maintained, and what is the community like? As a Korean citizen, I decided to explore some open source communities in Korea.
As in the United States, big Korean tech companies maintain and lead open-source projects on GitHub.
Naver, the equivalent of Google in Korea, hosts projects under MIT, BSD, and Apache licenses such as:
Kapture: file format used to describe Structure From Motion and other sensor-acquired data;
Fixture Monkey: generates test instances and edge cases;
Billboard.js: JavaScript chart library based on D3.js.
Samsung, the largest conglomerate in Korea, hosts larger open source projects such as:
Veles: distributed platform for deep learning application development;
ONE: high performance, on-device neural network inference framework;
GearVR: VR rendering library for applications on VR-supported Android devices.
There are some large and widely used open source projects hosted in individual developers' repositories as well. For instance, @junegunn created:
fzf: command-line fuzzy finder with 39.5k stars and 1.7k forks;
vim-plug: a minimalistic Vim plugin manager with 24.8k stars and 1.5k forks.
Though these projects are well-developed and innovative, both the software and the communities are generally much smaller. This shows in the number of commits, the number of issues and pull requests in progress, the number of contributors, and the fork and star counts. There is also often little or no clear documentation on how to get started, and few communities to reach out to, but this is probably true for smaller projects in the United States too.
To learn more about the community, I ventured into Open Source Software Community (OSS), an organization managed by the National IT-Industry Promotion Agency of Korea “dedicated to cultivate open source demands of local and international market, as well as to build virtuous cycle of industrial ecology”. OSS hosts the Korea Open Source Software Developers Lab (KOSSLab), which aims to create a prestigious community of *extremely* talented coders to “empower nation’s technical competence”.
It is really interesting to see that the open source community in Korea is driven by technology and ambition. From my research, I could not find many projects with a humanitarian focus. Rather, the projects were mostly industry-specific and niche. Song wrote a Medium article in 2019 in which he listed and briefly analyzed the 100 most popular open source projects in Korea. Artificial intelligence played a huge role, with many projects incorporating deep learning, TensorFlow, NLP, and neural networks. Python was also the most used language, which is fitting for an AI-focused community. Even OSS seemed to put more emphasis on the development aspect than on the community aspect.
References:
1: https://www.upgrad.com/blog/open-source-repositories-github/
2: https://opensource.com/article/19/5/projects-south-korea
3: https://medium.com/supple/%ED%95%9C%EA%B5%AD-%EC%98%A4%ED%94%88%EC%86%8C%EC%8A%A4-%ED%94%84%EB%A1%9C%EC%A0%9D%ED%8A%B8-top-100-739dafc082cf
4: https://www.oss.kr/
5: https://www.oss.kr/en_oss_frontier_lab
First Post
First post! This account will be used for Dickinson’s senior seminar (https://dickinson-comp-491-492.github.io/website/01-BlogSlackWikiGit.html) to post reflective writing on reading and discussion topics related to social, ethical and legal issues in technology.
For starters, my plan after college is to work in the industry for a few years. I am returning to the place where I interned, and if I want to, I will pursue grad school in the near future to expand my interests.