I don't see any reason why not, though I am not an expert on the optics of gun scopes. I know they use several lenses, just like digital cameras as discussed in that video you link to above; but if you can reduce the blurring, by better focusing using a single flat lens, then the stack of lenses you would need could be made shorter.
I suspect there are other options that would work at least as well. Compressed sensing is one that comes to mind. Let me do a short digression to explain how that works:
One way you can photograph a scene is to use a 1 pixel camera, where you raster-scan (similar to how the old cathode ray tube TVs drew a picture line by line) zig-zag left-to-right and right-to-left down a field of view, picking up one pixel's worth of image data each time you change where that 1 pixel camera is pointing. That will require you to make 1 million single-pixel scans to get 1 million pixels' worth of image data (say, a 1000 x 1000 pixel array), which means you would have to make it move very quickly.
However, as most people know, there is a lot of redundancy in images -- you can often compress an image 100-fold with almost no loss of content (e.g. JPEG compression). The problem is: how can you get that 1 pixel camera to do 1/100 the amount of work, thereby working 100 times faster, to give a good 100-fold compressed version of the image?
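To make the redundancy point concrete, here is a toy sketch (my own illustration, not how JPEG actually works internally: JPEG uses a block-wise cosine transform, and I am using a plain FFT as a stand-in). The smooth synthetic "image", the 1% figure, and all variable names are assumptions chosen for the demo.

```python
import numpy as np

# A smooth synthetic 64 x 64 "image": one wide Gaussian blob (hypothetical test scene).
side = 64
yy, xx = np.mgrid[0:side, 0:side]
image = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 8.0 ** 2))

# Move to the frequency domain and keep only the largest ~1% of coefficients.
spectrum = np.fft.fft2(image)
keep = int(0.01 * spectrum.size)
threshold = np.sort(np.abs(spectrum).ravel())[-keep]
compressed = np.where(np.abs(spectrum) >= threshold, spectrum, 0)

# Rebuild the image from the few coefficients that survived.
restored = np.fft.ifft2(compressed).real
relative_error = np.linalg.norm(restored - image) / np.linalg.norm(image)
print(f"kept {np.count_nonzero(compressed)} of {spectrum.size} coefficients, "
      f"relative error {relative_error:.3f}")
```

Real photographs are less tidy than this blob, so the achievable ratio depends heavily on the scene.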
One way this can be done is to send the image you are scanning through a filter array, consisting of, say, 1000 x 1000 little squares that can very quickly change from being transparent to being opaque, and then you let your 1 pixel camera see the light that passes through that plate. For each random setting of transparent or opaque squares, the 1 pixel camera records the total light passing through the transparent ones, which is effectively a different "linear combination" of the light from each point of the image. After your 1 pixel camera has recorded 10,000 different "linear combinations", you will basically have 10,000 measurements of the image (not 10,000 individual pixels, but 10,000 numbers that each mix together many pixels).
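Here is a rough simulation of that measurement step, shrunk to a 32 x 32 scene so it runs instantly. The scene, the one-tenth measurement budget, and every name in it are made up for illustration; a real device would do this optically, not in software.

```python
import numpy as np

rng = np.random.default_rng(0)

side = 32                          # toy stand-in for the 1000 x 1000 array
n_pixels = side * side
n_measurements = n_pixels // 10    # hypothetical budget: one tenth of a full raster scan

# Hypothetical scene: a bright rectangle on a dark background.
scene = np.zeros((side, side))
scene[8:24, 12:20] = 1.0

# Each mask sets every little square to transparent (1) or opaque (0) at random.
masks = rng.integers(0, 2, size=(n_measurements, side, side))

# The 1 pixel camera reports the total light that gets through each mask.
readings = np.array([(mask * scene).sum() for mask in masks])

# The same thing written as one matrix-vector product, y = Phi @ x.
Phi = masks.reshape(n_measurements, n_pixels)
readings_matrix = Phi @ scene.ravel()
```

Each entry of `readings` is one "linear combination": a sum over whichever scene pixels happened to sit behind a transparent square.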
Now you have a reconstruction problem: given those 10,000 measured values, how do you use them to produce, basically, a compressed image file? This is where you need to use some mathematics, and suffice it to say that if your filter patterns are sufficiently "random" and "independent", and the image really is compressible, then it's doable.
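For the curious, here is a minimal sketch of one way the reconstruction can go, using orthogonal matching pursuit (a simple greedy method; I am not claiming it is what any real camera uses). I am assuming a scene that is sparse in the pixels themselves (a few bright points on black), whereas real images are usually sparse only after a transform. I am also assuming one extra reading with every square transparent, so the 0/1 masks can be re-centered. All names and sizes are hypothetical.

```python
import numpy as np

def omp(A, y, k):
    """Greedy sparse recovery: find x with at most k nonzeros so that A @ x is close to y."""
    col_norms = np.linalg.norm(A, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column that best explains what is still unexplained.
        corr = np.abs(A.T @ residual) / col_norms
        if support:
            corr[support] = 0.0
        support.append(int(np.argmax(corr)))
        # Re-fit all chosen columns at once by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
n, m, k = 256, 128, 4              # pixels, measurements, bright points (toy sizes)

x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.uniform(1.0, 2.0, size=k)

masks = rng.integers(0, 2, size=(m, n)).astype(float)
readings = masks @ x_true
total_light = x_true.sum()         # assumed extra reading with every square transparent

# Re-center the 0/1 masks to -0.5/+0.5 so the columns are close to uncorrelated.
A = masks - 0.5
y = readings - 0.5 * total_light

x_hat = omp(A, y, k)
print("max reconstruction error:", np.abs(x_hat - x_true).max())
```

The point is only that 128 mixed readings can pin down a 256-pixel scene when you know in advance that very few pixels are lit.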
You can extend this principle from a 1 pixel camera to a small array of pixels, say 10 x 10, and then reduce the number of recordings by a factor of 100 again.
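Putting the numbers so far side by side (just the arithmetic from the paragraphs above; the variable names are mine):

```python
pixels = 1000 * 1000          # target image size
compression = 100             # assumed 100-fold compressibility
array_pixels = 10 * 10        # small sensor array instead of a single pixel

raster_scans = pixels                                  # plain 1 pixel raster scan
single_pixel_exposures = pixels // compression         # compressed sensing, 1 pixel
array_exposures = single_pixel_exposures // array_pixels  # compressed sensing, 10 x 10 array

print(raster_scans, single_pixel_exposures, array_exposures)
```

So the count drops from 1,000,000 scans to 10,000 mask exposures, and then to 100 exposures with the small array.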
Now, there's something really interesting you can do with this: you can photograph features smaller than seems to be physically possible. There are physical limits to how many pixels you can pack into a camera array, so that they each properly photograph a scene. However, because scenes can often be compressed greatly, a lot of the "high spatial frequency" information (the very small-scale structure) can often be deduced from what your coarser recording device picks up. In other words, you can use "image priors" to "super-resolve" the scene. So, using compressed sensing + a high-density camera array, you can quickly (say, 100 frames per second) produce extremely high resolution images of a scene.
If you have very, very high resolution images, you can do very, very high degrees of magnification, as you might need for a gun scope.
Now, I think that all of what I said should work to make a light-field camera, as well -- where each pixel in the camera array doesn't just report the amount of each color of light hitting it, but also reports the amount of light from each different angle (or at least a sufficiently dense set of angles) hitting it. Why do that? Because it allows you to change the focal plane, giving you sharp and un-blurry magnifications. Obviously, this will make the files you work with a lot larger, but the memory footprint isn't really as important as the number of scans per second that you need to do.
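To show why the angle information lets you pick the focal plane after the fact, here is a toy 1-D sketch of "shift-and-add" refocusing, a standard trick for light-field data. The point-source scene, the five viewing angles, and the 3-pixel parallax are all assumptions for the demo.

```python
import numpy as np

def refocus(light_field, angles, shift_per_angle):
    """Shift-and-add refocusing: undo each view's parallax, then average the views."""
    shifted = [np.roll(view, -u * shift_per_angle)
               for view, u in zip(light_field, angles)]
    return np.mean(shifted, axis=0)

# Hypothetical scene: a single point of light on a 64-sample line.
scene = np.zeros(64)
scene[30] = 1.0

# Five viewing angles; an object at this depth slides 3 samples per angle step.
angles = [-2, -1, 0, 1, 2]
true_disparity = 3
light_field = [np.roll(scene, u * true_disparity) for u in angles]

sharp = refocus(light_field, angles, true_disparity)  # focused at the object's depth
blurry = refocus(light_field, angles, 0)              # focused at the wrong depth

print("peak when focused:", sharp.max(), " peak when defocused:", blurry.max())
```

Choosing a different `shift_per_angle` is the software equivalent of turning the focus ring, which is why nothing has to move mechanically.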
Ok, now if you have a flat, light-field, compressed-sensing, super-resolution camera, it should be possible to produce a gun scope that can see quite far, and that is very compact -- way, way more compact than existing gun scopes.
I am not an expert on this stuff (optics), and it may be that there is a much simpler approach (an expert might say, "Lol! Yes, that will work, but it's stupidly inefficient. Here is a much better way..."); but it sure seems possible to me -- though it may take a couple of years before you see something like this manufactured as a mass-consumer product.