Chris Svajlenka's musing on web development and things.
I was unsure whether console output in AWS Lambda was asynchronous or synchronous, so I investigated and found out.
I was recently fine-tuning the performance of a large retailer’s Lambda function that calculated discounts on a commercetools cart. As part of that investigation, I wondered whether the important pieces of the function were being blocked by logging statements. In case you didn’t know, console.info() can be a synchronous call in certain contexts: depending on what stdout is attached to, logging via the console object can block your code until the log finishes writing. More info here: https://nodejs.org/dist/latest-v12.x/docs/api/process.html#process_a_note_on_process_i_o In my Lambda I was calling that all over the place with large blobs of JSON to log! Yikes, I only have 2 seconds allowed for this function to execute, so every ms matters!
I ran a quick test on Lambda:
console.info(JSON.stringify({ isTTY: Boolean(process.stdin.isTTY) }, null, 2));
I triggered my lambda, checked the log, and was relieved to see "isTTY": false.
This meant my calls were asynchronous and I could log all I wanted without worrying too much about the performance impact.
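For a more direct check, here’s a minimal sketch (assuming the Node.js Lambda runtime) that inspects the streams console actually writes to; on Lambda these are typically pipes, which the Node docs above say are written to asynchronously. The handler and field names are just illustrative:

// Hypothetical handler for the check described above.
exports.handler = async () => {
  const report = {
    stdoutIsTTY: Boolean(process.stdout.isTTY), // false when stdout is a pipe or a file
    stderrIsTTY: Boolean(process.stderr.isTTY),
  };
  console.info(JSON.stringify(report, null, 2));
  return report;
};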
I hope all the other FaaS providers are ensuring that stdout and stderr calls are asynchronous as well.
Use this blog post as a reminder to check your applications to see if your logging is blocking or not!
0 notes
Text
Continuous Deployment with Github Actions
GitHub recently released its tooling for automating projects, and I got a chance to dive in and use GitHub Actions on a real project during October. I went all out, automating a full CI/CD pipeline for a Python project that used pipenv and Terraform, deploying to Google App Engine. The pipeline deployed opened PRs to a review environment and tore that environment down when the PR was closed. When PRs were merged to develop or master, or a versioned tag was pushed, it deployed to dev, staging, or production respectively. Linting and tests were automated as well, in a separate workflow.
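For a rough idea of the shape of it, here’s a condensed sketch of that kind of workflow; the job names, scripts, and helper paths are placeholders, not the real pipeline (which used pipenv, Terraform, and App Engine):

name: deploy
on:
  pull_request:
    types: [opened, synchronize, closed]   # review environments per PR
  push:
    branches: [develop, master]            # dev / staging
    tags: ['v*']                           # production
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Resolve target environment
        run: ./scripts/resolve-environment.sh   # hypothetical helper
      - name: Deploy (or tear down a closed PR's review environment)
        run: ./scripts/deploy.sh                # hypothetical helper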
Pain points
None of the tooling around running workflows locally worked with the v2 syntax at the time of this writing. This caused me to push a lot of commits when debugging my workflows.
Workflows don’t trigger other workflows: events created from within a workflow (for example, with the default token) won’t kick off new workflow runs. This ruined a lot of my plans to keep pieces of code separate. I wanted one workflow to build, one to create deployments, one to respond to the deploy event to actually deploy the thing, and so on. I ended up having to do a lot of things in that build step.
Lack of support for YAML aliases, again ruining my ability to avoid repeating myself.
actions/bin was removed by GitHub while I was building my workflow. At the time, the recommended Terraform workflow relied on this action and I had to modify things slightly.
No scoped secret support - it would be great to have secrets per environment (staging envs, dev envs, etc.).
No defined environments support in GitHub - I had to define these in my workflow.
Overall, GitHub Actions is a very promising feature. I’m looking forward to the general release!
Today I learned (curl edition)
I learned today that curl supports ftp (from https://superuser.com/questions/323214/how-to-upload-one-file-by-ftp-from-command-line):
curl -T my-local-file.txt ftp://ftp.example.com --user user:secret
Quick tip for users of sequelize and yarn
I ran into a small annoyance with the combination of sequelize-cli and yarn while working on a project. yarn’s autoclean feature (driven by the .yarnclean file) strips a whole bunch of files from installed packages, and the default template includes gulpfile.js - a file that sequelize-cli needs in order to function. If you’re running into a “No gulpfile found” error when running sequelize, comment out the gulpfile.js entry in your .yarnclean file, run yarn again, and then you should be fine to run sequelize.
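For reference, this is roughly what the relevant bit of .yarnclean looks like once the entry is disabled (an excerpt sketch; yarn’s default autoclean template is what adds the entry in the first place):

# .yarnclean (excerpt)
# gulpfile.js   <- commented out so sequelize-cli's bundled gulpfile survives `yarn`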
Coldfusion Time Zone Fun!
I’ve been dealing with a lot of weird time-scheduling fun in ColdFusion. Essentially I was building a feature that allowed users to pick an hour and minute to get a daily reminder of either today’s appointments or tomorrow’s. This was shortly after our application switched its database to PostgreSQL, and I got excited about the cool data types available in Postgres. Unfortunately, CF’s ORM doesn’t play nice with those, so I couldn’t use them.

Anywho, after doing some pretty gross stuff data-wise to handle this feature, reports came in of reminders not arriving on time. I tracked it down to the check that determines whether we’ve already sent a given reminder in the past 24 hours. I had stored that time in the database as UTC, and I had already converted the server time for now() to UTC using dateConvert('local2utc', now()). I then used dateDiff() to determine the number of hours between now and the time the reminder was last processed. When the times looked like they were 24 hours apart, dateDiff() returned 20 hours, which was causing the trouble.

It turns out that because I had done that conversion, CF was nice enough to remember the original time zone the value came from and used that in the dateDiff() calculation. I had assumed dates in CF were not TZ-aware. I was wrong! Anyway, to make CF forget about the time zone the server time came from, a simple toString() around the date worked.
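Here’s a minimal CFScript sketch of the gotcha and the fix (the dates are made up; the real values came from the database):

utcNow   = dateConvert("local2utc", now());       // CF quietly remembers the original time zone
lastSent = createDateTime(2015, 6, 1, 12, 0, 0);  // the UTC timestamp read back from the DB
hoursApart = dateDiff("h", lastSent, utcNow);            // can come up short (e.g. 20 instead of 24)
hoursFixed = dateDiff("h", lastSent, toString(utcNow));  // toString() makes CF forget the TZ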
Lesson learned: CF Dates can be TZ-aware.
Don’t always trust your libraries to do what you expect.
I recently implemented chunked, or “multi-part”, uploading in a video uploader. As part of that I made use of Dropzone (already there from work I had done earlier) and resumable.js (the glue that makes chunked uploading work). This was a great win: users with spotty internet connections could now upload large files without worrying about the upload failing halfway through. As part of my testing for this feature, I would start an upload, disconnect from the internet, wait, reconnect, and ensure the upload finished. Hooray!
This made it out to production and everything seemed okay. However, a few weeks after it was rolled out we got a few reports of “This file just wasn’t working when we uploaded it”. Initially it seemed a lot of those reports were due to encoding and our transcoder not playing nice with it. Eventually a report came in that one of the videos that was uploaded, when played back, was missing the last few seconds of video! Uh oh.
This immediately smelled of something going foul with either the chunking, or the rebuilding of the file server-side. We reached out to our transcoding service and learned we were missing a few parameters in the request we were making prior to the actual upload. Well, cool, I went about starting to fix that when I noticed that the exact number of chunks I was uploading was off by one.
It turns out that, by default, resumable.js sizes the final chunk anywhere between n and 2n-1 (n being the configured chunk size). https://github.com/23/resumable.js/issues/51 is the issue where some unfortunate soul discovered this odd behavior. Anywho, the company behind resumable.js decided that, because of their specific use case around the last chunk’s size, they wanted this to be the default behavior. This caused me some headache, but at least they stuck the forceChunkSize option in there to override their silly default.
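For reference, the override is just another option when constructing the uploader. A sketch (option names per the resumable.js README; the target URL and chunk size are illustrative):

var r = new Resumable({
  target: '/api/upload',       // hypothetical chunk upload endpoint
  chunkSize: 1 * 1024 * 1024,  // 1 MB chunks
  forceChunkSize: true         // keep every chunk at chunkSize instead of letting the last one grow
});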
So, lesson learned: always check that the library you’re trusting to work actually works the way you expect it to.
Undefined iterator in a for loop - lolColdfusion
I ran into this fun little gem about a week ago and remembered I should share it with the world:
public function errorout() {
    // I know what will cause an error!
    totalLen = 0;
    var derp = ['thing', 'thing2', javaCast('null', 0), 'anotherthing'];
    for (var x in derp) {
        totalLen += len(x);
    }
}
errorout();
This will throw an exception: Variable X is undefined.
Isn't that wonderful? This is because of the way ColdFusion "handles" null variables. Utterly terrible. If you are ever doing something similar (iterating over an array where an element might be null), you can avoid some headache by adding this bit at the beginning of your loop:
if (!isDefined('x')) { continue; } // change x to the name of your loop variable.
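Putting it together, a sketch of the same function with the guard in place, so only the null element gets skipped:

public function noerror() {
    var totalLen = 0;
    var derp = ['thing', 'thing2', javaCast('null', 0), 'anotherthing'];
    for (var x in derp) {
        if (!isDefined('x')) {
            continue; // skip the null element instead of blowing up on len(x)
        }
        totalLen += len(x);
    }
    return totalLen;
}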
Fixing broken Scheduled Tasks in the CF 10 admin
Today I ran into a fun bug with the CF admin. You might inherit a ColdFusion server, and it might be broken, and you might not know it.
I found out my development instance of CF was broken when I went into the CF admin and tried to view the scheduled tasks page. It blew up with an error referencing an E:/cf_10/ path. Yay! I don’t even have an E:/cf_10/ directory on this Linux machine! How is this possible?
After some googling it looks like this is a bug that has been reported to Adobe: https://bugbase.adobe.com/index.cfm?event=bug&id=3575011 and https://bugbase.adobe.com/index.cfm?event=bug&id=3621124. The solution? Fix your (probably) corrupted lib/neo-cron.xml file and restart CF.
Below is a sample blank neo-cron.xml file, as CF is not cool enough to know how to fix this file if it’s empty (this was my case) or doesn’t exist (I tried this too).
<wddxPacket version='1.0'>
    <header/>
    <data>
        <array length='4'>
            <struct type='coldfusion.server.ConfigMap'>
            </struct>
            <boolean value='true'/>
            <string></string>
            <string>log,txt</string>
        </array>
    </data>
</wddxPacket>
After resetting your neo-cron.xml file, you should be able to restart CF and see a very blank, yet functioning, scheduled tasks page in the admin interface.
Second level Hibernate Caching for Coldfusion on AWS using Hazelcast.
If you’re unfamiliar with Hibernate, or Coldfusion, feel free to skip over this post.
I work on a ColdFusion 10 application. Recently we made the push to have distributed sessions shared in Redis. In the past, the site used Ehcache as a second-level cache for Hibernate. This was fine when the application ran on only one server; now there are three, and this caching was turned off. I had the fun task of figuring out how to enable a distributed L2 Hibernate cache. Oh, how much fun I had!
The documentation for this feature, http://help.adobe.com/en_US/ColdFusion/10.0/Developing/WSCAD9638E-2D2C-48d8-9069-AE5A220B75A6.html#WS4C3B91C4-E209-4449-B1EE-E44F4F5D3D14, lists a few cache providers:
ormsettings.Cacheprovider This setting defines the cache provider that needs to be used for secondary cache. This defaults to EHCache. The other values for this setting are JBossCache, OSCache, SwarmCache and Hashtable. You can also specify the fully qualified class name of the cache provider.
Great! They have given me some suggestions on what I could use! A little bit of googling later and I found out that OSCache and SwarmCache are dead, Hashtable is not intended for production use and is not safe for clusters, and Ehcache is not “cluster safe”. Well, that last bit about Ehcache is not quite true: Ehcache became part of Terracotta’s enterprise solution (BigMemory Go, or was it BigMemory Max); either way, the open source version was limited to a single server plus failover. This would not suffice. Oh, and the other option listed for a cache provider, JBossCache, is now Infinispan. Great. This caused plenty of confusion. Maybe if I were using the most up-to-date version of CF, their documentation could have pointed me in the right direction. I don’t know, but this was terrible.
So, I did more googling and thought “maybe Ehcache with replication will work”. I tried to set this up and everything looked okay *except* replication. I was dumbfounded. I had no idea why it wasn’t working; I checked that multicast was enabled on the interfaces I was using. I tried very hard to use the ehcache-debugger jar. Sidenote: I had plenty of fun inspecting the jar to see what other jars it was looking for and going out and finding them -- classpath fun. Once I resolved the classpath errors with the debugger, I just ran into something about my defaultCache not supporting statistics.
So I gave up on replicated ehcache.
More googling later, I found out that Hazelcast and Infinispan are both open source Hibernate L2 cache providers. Infinispan clusters are set up via JGroups, something I had already tried with Ehcache’s replication and had no luck with, so I decided to pass on that and started out with Hazelcast.
While getting Hazelcast set up, I noticed they have an <aws> section in their default config file. My interest was piqued, as our servers are in AWS. I looked up what this did, and found out:
Configuring EC2 Auto Discovery
Hazelcast either uses Multicast or TCP/IP for discovery, but EC2 does not support multicast. To configure Discovery using TCP/IP, you need the IP addresses upfront and this is not always possible. To solve this problem, Hazelcast supports EC2 auto discovery, which is a layer on top of TCP/IP discovery.
So *that’s* why my replication stuff didn’t work: EC2 just flat out does not support multicast. Good to know. Well, I’m glad Hazelcast has a workaround.
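The relevant bit of hazelcast.xml looks roughly like this (a sketch following the structure in the Hazelcast docs; the keys, region, and security group are placeholders):

<network>
    <join>
        <multicast enabled="false"/>
        <tcp-ip enabled="false"/>
        <aws enabled="true">
            <access-key>YOUR_ACCESS_KEY</access-key>
            <secret-key>YOUR_SECRET_KEY</secret-key>
            <region>us-east-1</region>
            <security-group-name>your-hazelcast-sg</security-group-name>
        </aws>
    </join>
</network>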
After loading up the jar and setting the cacheProvider to
ormsettings.cacheProvider = "com.hazelcast.hibernate.HazelcastCacheRegionFactory";
I found out that CF’s version of Hibernate didn’t like this; it was expecting a class that could be cast as a CacheProvider. The Hazelcast documentation points to using com.hazelcast.hibernate.provider.HazelcastCacheProvider as the class name. I tried setting this and got a class not found error. Yay! It looks like that class was removed in the Hazelcast 3.x branch. Thankfully, the 2.x branch was still being updated, so I downloaded that, set it up accordingly, and it worked! Even the AWS auto discovery! I could see the same objects in the cache across nodes and everything! Yay!
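For reference, the working setup boiled down to something like this in Application.cfc (a sketch; the ORM setting names come from the Adobe docs linked above, the provider class from the Hazelcast 2.x Hibernate integration):

component {
    this.ormEnabled = true;
    this.ormSettings = {
        secondaryCacheEnabled = true,
        cacheProvider = "com.hazelcast.hibernate.provider.HazelcastCacheProvider"
    };
}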
Redis and Coldfusion, because why not?
The company I work for has been growing. They now have 3 web nodes serving up their application. This is a good thing. However, having 3 web nodes causes strain whenever a deployment needs to happen. Why? Sessions are not stored in a way that is shared across web nodes; they relied on sticky sessions at the load balancer to keep users tied to a particular web node. During a deployment, individual nodes would be updated, and users would be kicked off (logged out) of one web node and sent to another while the deploy occurred. This is bad and makes deployments a rough time.
Now, I’m not a stranger to storing sessions in some kind of cache that all web nodes can access. I’ve helped implement this by hand with PHP + memcached, and have also used Drupal’s memcached module. When looking at how the ColdFusion community handles this need, it was not very clear what backend they preferred. A lot of my research ended up pulling up results on how to enable the second-level ORM cache -- we’ll get to that in another post. I was more focused on distributing the sessions at the Tomcat level, since we were using J2EE sessions. There was talk about how it would be nice to use something open source like memcached or Redis to store session data in, because we could later use it for application-level caching too. I took this idea and ran with it. I found a great Redis session manager for Tomcat: https://github.com/jcoleman/tomcat-redis-session-manager. I compiled it and set it up on my dev instance right away.
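The Tomcat side is wired up in context.xml; a sketch (class names as I recall them from that project’s README; host and port are placeholders):

<Valve className="com.orangefunction.tomcat.redissessions.RedisSessionHandlerValve" />
<Manager className="com.orangefunction.tomcat.redissessions.RedisSessionManager"
         host="your-redis-host"
         port="6379"
         maxInactiveInterval="3600" />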
Then I ran into serialization errors. As it turned out, our CF application was doing something not-that-great: storing components/Hibernate objects in session variables. Those things aren’t serializable, apparently. I wrote some code to export each object to a simple struct, store that in the session variable, and load the associated Hibernate objects on each request by the primary key stored in that struct. This way we stored nice, happy, serializable session data in Redis. Happy day!
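The pattern ended up looking roughly like this (a CFScript sketch; the component and property names are hypothetical):

// Keep only serializable primitives in the session, reload the entity on each request.
function storeUserInSession(required any user) {
    session.user = { id = user.getId(), email = user.getEmail() };
}
function currentUser() {
    return entityLoadByPK("User", session.user.id);
}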
Wrong! I nearly forgot about one more thing that happens when you move away from a single web node -- handling requests on just-uploaded-files.
A few areas of the application did some things to files that were just uploaded in the last request. A good example of this is import functionality with a preview screen.
I initially wanted to set up GlusterFS to provide a shared temporary filesystem for the web nodes, but because of the overhead of getting that going (I know it’s not really all *that* much) and the time frame in which we wanted to roll this new functionality out, that setup was shelved for later. In the meantime we used s3fs to provide a temporary filesystem backed by Amazon S3. Once we got that set up, everything was good to go.
I am really, really thankful that tomcat-redis-session-manager was a thing. We’re now using it in production and are investigating Redis as the backend for our ORM second-level cache.
Resolving similar git conflicts made easy
Have you ever run into the problem where you seem to be resolving the same merge conflicts over and over in git? Have you gotten tired of resolving them the same way each time? Git already has the answer for you: git rerere. This feature records your merge conflict resolutions, and git will then automagically apply them for you if it runs into the same conflict again! This is pretty magical, and I hadn’t found out about it until a few coworkers ended up resolving the same conflicts over and over again. A quick bit of googling turned up rerere.
Rerere actually stands for "reuse recorded resolution" and can help save you time if you’re stuck resolving that same darned conflict more than once.
In a nutshell, you can enable it via git config --global rerere.enabled true and it will automatically save conflict resolutions and attempt to reapply them for you. Sometimes you don’t want a recorded resolution to be reapplied; in that case, either disable rerere, or check out the conflicted version of a file after a rerere-resolved merge with git checkout --conflict=merge file.txt. You can then resolve the conflict in some other fashion.
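Put together, the whole thing is two commands (the file name is just an example):

# Enable once, globally; git then records each resolution under .git/rr-cache
git config --global rerere.enabled true

# If a recorded resolution was auto-applied and you want to redo it by hand,
# restore the conflict markers for that file and resolve it again:
git checkout --conflict=merge file.txt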
Further reading:
https://git-scm.com/blog/2010/03/08/rerere.html
https://www.kernel.org/pub/software/scm/git/docs/git-rerere.html