Local Install vs GPU / Render Farms (online GPU)

    • #15818
Zandebar
      Participant

        Hello

This area is a fun topic: local install vs GPU / render farms is always a fun discussion. I'm personally having a mental battle over which will be better for my needs, and it all boils down to 'cold hard cash'. As a hobbyist in this area I simply don't have the funds to spend on the hardware locally (boy, do you need hardware). That may be a good thing, as computing technology is constantly changing, in a constant state of flux, so where do you get stability?

Let's face it; consumer technology is always lower powered and a bit of a playground for emerging technologies before they hit server / pro grade hardware. The pro / server products range from ~$2,700 (NVIDIA RTX 4500) to ~$149,000 (Nvidia DGX A100), while consumer grade cards are geared to gamers (NVIDIA GeForce RTX 4090 at ~$2,275). Creatives are steered towards entry-level pro cards such as the Quadro series or the newer RTX A4000/A4500/A5000/A6000 series, and these cards tend to be out of the budget of hobbyists / creative enthusiasts. With that, and as the saying goes, 'if YOU can't buy, RENT', renting is starting to look appealing to me.

         

I'm based in the UK and I have £1,000 GBP (~$1,300 US). Price-wise it's usually pounds for dollars: whatever the dollar price of an item is tends to be the UK pound (£) price (we do get ripped off here). The best I can budget for in a GPU at the moment is the GeForce RTX 4080 Super with 16GB VRAM; if I can fit a 24GB card into roughly £1,000 I will. If the new GeForce RTX 5080 with 16GB VRAM looks good then I may spring for the difference in performance (though there isn't a lot compared to the 4080), and I might get a 4090 in that price bracket after the RTX 50 series launches, but that looks unlikely. That really sets up this thread.

         

While I'm here, a useful link to Nvidia's website comparing its range of high-performance GPUs:

        https://www.nvidia.com/en-us/studio/compare-gpus/

         

        Nvidia Blackwell

The future of Nvidia's Blackwell architecture GPUs is questionable, given the roughly doubled die size and the issues Nvidia is having with the complex production of the new RTX 50 series. With all the delays, it does beg the question of whether Nvidia will make the planned launch at the Consumer Electronics Show in January 2025. Will the production problems affect the physical cards when they hit retailers, and are we going to see issues with these cards? Intel has faced problems with its recent CPUs and the current flop of its Core lineup; is this going to play out the same way for Nvidia? The benefits of the Blackwell architecture do look great, provided there are no physical hardware issues for the consumer. Reports say the publicly known production issues have been fixed, but at a company level, what problems remain that haven't been disclosed? Only time will tell whether this new series is a success and the promised advancements in GPU technology are realised.

         

Hey! There's a new kid on the block in development which lends itself to generative AI.

These are 'probability processors', currently under development in labs. If they come to fruition, they will revolutionise the generative AI landscape. This area is really exciting for me hardware-wise, as these chips would sit in between traditional processors and quantum processors. Oh, exciting, and the future is GLASS!

         

Hardware shifts and you're in a perpetual upgrade cycle; if you can afford to spend ~$2,275 every time a flagship GPU is released then great, you're going to have a fantastic experience. When it comes to generative AI, having the luxury of high VRAM and speed is great if you're able to do that. But unless you have a healthy bank balance or you're a working professional, this hardware will be out of reach for a good percentage of generative AI users. A compromise is needed to address this, because AI models and workflows will only get bigger as the hardware allows developers to build them bigger. As they say, 'If you build it, they will come'… OK, it's a movie reference… you see what I mean, right? (here's the link) You need a technique to level the playing field! Too much?

I don't know if I'm right here; I'm not on a laptop, so people who are will probably have a different experience. Maybe for them GPU / render farms are the only way to produce their generative AI output. Those of us with ATX-type hardware, who can choose what GPU goes inside, will have freedom of choice limited (maybe) only by financial constraints.

We know that model sizes will only grow as VRAM grows; 24GB is going to be the new 16GB, 16GB is going to be the new 12GB, and so on. I'm wondering what the limitations of each are: what can I do and what can't I do with 12GB, 16GB, 24GB and 48GB (currently)? As I'm learning at the moment, where is that wall, and how can I address it in a local install? My current thought is not to worry about the hardware for local installs and just buy the best card for my budget. Then work the issue: if I need more VRAM or speed (video), page the task out to a render farm and pay for what I use.

The reality may be that during my workflow I generally don't exceed XXgb of VRAM, so my use of a render farm is infrequent. Or I may exceed XXgb of VRAM and my use of the render farm is high, in which case my local install is used to hone the prompt: cutting the trash out of my generations, getting to that perfect image, and only sending 10% of my output to outsourced rendering. Where's the balance, I wonder?

         

I've done the math on an average GeForce RTX 4090 GPU @ ~$2,275.

Spread over a given hardware iteration cycle, that gives you a monthly budget to spend on a GPU render farm instead (a quick sketch of this calculation follows the list):

• 24 months: $94.79
• 36 months: $63.19
• 48 months: $47.40
• 60 months: $37.92
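To make that concrete, here's a minimal Python sketch of the same amortisation maths, with a purely hypothetical render-farm hourly rate added so you can see roughly how many farm hours the same monthly budget would buy. The $0.99/hour figure is just a placeholder I made up, not a quote from any real service.

```python
# Amortise a flagship GPU purchase over different upgrade cycles and
# compare the monthly figure with renting GPU time instead.
GPU_PRICE_USD = 2275.0          # approx. street price of an RTX 4090
FARM_RATE_USD_PER_HOUR = 0.99   # hypothetical render-farm rate (placeholder)

for months in (24, 36, 48, 60):
    monthly = GPU_PRICE_USD / months
    farm_hours = monthly / FARM_RATE_USD_PER_HOUR
    print(f"{months} months: ${monthly:6.2f}/month "
          f"~ {farm_hours:5.1f} farm hours/month at the placeholder rate")
```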

         

My current card is a GeForce RTX 2070 Super with 8GB VRAM, which is OK while I'm learning SD and figuring out which direction I will go in. I certainly need an uplift in VRAM and speed locally, or I could go online render farms only. But I have no concept of the actual cost of doing that and the implications it would incur, plus I also don't know at this time what my usage of SD will be. To a certain degree it makes more sense not to upgrade the GPU right now, as I certainly didn't get my money's worth out of the RTX 2070, even though it has been nice to have since 2019. With my current habits it makes more sense to farm out my GPU needs until I have assessed my costs and usage.

I'm certainly leaning towards online rendering as it has scalability and flexibility, given the move to dumb terminals. Hardware is getting more expensive and fewer units are being produced. We're seeing the rise of thin PCs and mini PCs in the home and workplace, where cloud computing is becoming the norm; having a traditional desktop is going to become more expensive for the average user as production of traditional components drops and desktop PCs become less significant. It's back again to 'cold hard cash'. Given the state of flux of technology and the flakiness of modern components, as we've seen in recent years, it does leave me thinking it's just not worth investing in the hardware, as all I need at the end of the day is a data file at a reasonable cost.

        Reasonable cost has a few factors: cost-effectiveness, time-effectiveness, ease of use and workflows.

Here is my bias: between hardware and cloud there are strengths in both, and between all hardware and all cloud, I really do prefer all cloud. At the end of the day, is it better to straddle the two, eking out the benefits of both solutions? Or is it one versus the other? I really don't know, and I'm trying to work out which is the best option given my bias. I do love hardware, but costs are getting prohibitive; there is strength in numbers (GPU farms), and if this were water it would be flowing downstream really fast.

        Figuring out a solution is always fun…

         

        All the best


      • #15820
Andrew
        Keymaster

          I use all cloud services in my day job. There’s no hardware maintenance and no noise when running a heavy job. I am fortunate enough that I was allowed to keep the VM running 24/7. I won’t enjoy it as much if I need to shut it down every day.

          Some online services, like Think Diffusion, auto-shutdown after a specific amount of time. This is not a bad way to control costs.

        • #15846
Zandebar
          Participant

On the GPU hardware side, I'm having trouble working out what you can actually do within each card's VRAM.

What are the limitations of each; what can't you do with 12GB, 16GB, 24GB and 48GB (currently)?

How much headroom do you need to leave in VRAM for resources other than the model? I'm guessing 2GB; I don't know if that's right, but it looks fair.

So that would potentially mean (I'm guessing; see the sketch after the list):

• 12GB = 10GB max model size
• 16GB = 14GB max model size
• 24GB = 22GB max model size
• 48GB = 46GB max model size
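Here's a tiny sketch of that guess so the numbers are easy to re-run with a different reserve; the 2GB headroom is just my assumption from above, not a measured value.

```python
# Rough 'max model size' estimate: total VRAM minus a fixed headroom
# reserved for activations, the OS/compositor and other overheads.
HEADROOM_GB = 2.0  # assumed reserve, per the guess above

for vram_gb in (12, 16, 24, 48):
    max_model_gb = vram_gb - HEADROOM_GB
    print(f"{vram_gb:2d}GB card -> ~{max_model_gb:.0f}GB max model size (rough guess)")
```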

             

How large do these models get? And then there's the workflow to consider: how does that affect the VRAM?

             

If you have a look at Flux.1, the information I found at

            https://medium.com/@researchgraph/the-ultimate-flux-1-hands-on-guide-067fc053fedd

            States:

            The regular version requires at least 32GB of system RAM. Testing shows that a 4090 GPU can fully occupy its memory. The dev-fp8 version is recommended for local use.

* I assume that when it talks about system RAM it means VRAM

             

So what's the difference between dev-fp8 and the regular model? (This was covered in the courses.)

             

Would the GeForce RTX 4090 function using this model (GB sizes from Hugging Face below)? We know that the new GeForce RTX 5090, reportedly with 32GB, will be able to handle it.

            flux1-dev.safetensors = 23.8GB

flux1-schnell.safetensors = 23.8GB (this is the same size as dev)

             

Also, the Stable Diffusion 3.5 model:

This model fits inside the VRAM of the GeForce RTX 4090 but not the 16GB GeForce RTX 4080 Super.

            stable-diffusion-3.5-large = 16.5 GB

These are the new models coming out, and if I'm correct they are starting to become prohibitive for hobbyists / creative enthusiasts who can't afford high-VRAM flagship GPUs. What I'm trying to get at is: running these models locally, what can you do with the best card you can afford?
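As a naive first-pass check, here's a sketch that compares the file sizes above against VRAM minus the same assumed 2GB headroom from earlier. It will overestimate the real requirement, since (as discussed further down the thread) not every part of a model has to sit in VRAM at once.

```python
# Naive fit check: does the checkpoint file fit into VRAM minus headroom?
HEADROOM_GB = 2.0  # assumed reserve, as guessed earlier in the thread

models = {"flux1-dev.safetensors": 23.8, "stable-diffusion-3.5-large": 16.5}  # GB on disk
cards = {"RTX 4080 Super (16GB)": 16, "RTX 4090 (24GB)": 24}                  # GB VRAM

for model_name, size_gb in models.items():
    for card_name, vram_gb in cards.items():
        verdict = "fits" if size_gb <= vram_gb - HEADROOM_GB else "does not fit"
        print(f"{model_name} on {card_name}: {verdict} (naive check)")
```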


          • #15880
Andrew
            Keymaster

              Several factors determine the relationship between model size and the required VRAM.

1. Not all model parts need to be in memory simultaneously. For example, the CLIP model can be unloaded after processing the prompt, and the VAE is required only after sampling. So the VRAM required is smaller than the model size.
2. A model's size is measured by the number of parameters. A parameter can be represented in different precisions on a GPU, e.g., FP32 (high), FP16, and FP8 (low). The lower the precision, the smaller the model's footprint in GPU memory, but the quality may be reduced.

Optimizations like these make it possible to fit large models in limited VRAM.
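As a worked example of point 2 (my numbers, not Andrew's): Flux.1 dev is roughly a 12-billion-parameter model, and multiplying the parameter count by the bytes per parameter of the chosen precision is what turns that into gigabytes, which is why the fp16 safetensors file lands around 24GB and an fp8 copy comes in at roughly half that.

```python
# Back-of-the-envelope model footprint: parameters x bytes per parameter.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}
PARAMS = 12e9  # ~12 billion parameters, roughly Flux.1 dev's size

for precision, nbytes in BYTES_PER_PARAM.items():
    size_gb = PARAMS * nbytes / 1e9
    print(f"{precision}: ~{size_gb:.0f}GB of weights before any runtime overhead")
```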


            • #15892
Zandebar
              Participant

                That does add an extra layer of complication to what I’m trying to work out.

                It looks like I’m going to need some time to get my head around the GPU issue and what it can or can’t do at a certain VRAM.

I just need to work out, let's say at 16GB, when I would need to use an online GPU service to render with a certain model. Surely I should be able to look at a model and say, OK, that model won't work on this local GPU, therefore if I want to use it I'll need an online GPU service. With that explanation it's not so straightforward, and it sounds like it's just a matter of when the GPU will run out of memory and give you an error. Surely that's not the case, is it? You should be able to apply some logic somewhere.

              • #15895
Andrew
                Keymaster

                  I agree it’s not entirely straightforward. I usually deal with memory issues as they arise. This can be done using a more memory-efficient version of the model (fp8, fp4), using a smaller image size, and unloading models from memory, etc.
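For what it's worth, here's a minimal sketch of what a couple of those mitigations can look like in code, assuming the Hugging Face diffusers library and the Stable Diffusion 3.5 Large checkpoint mentioned above. It shows half precision, CPU offloading and a smaller image size rather than true fp8 (which needs a quantization backend on top); the model ID, resolution and step count are my assumptions, and a ComfyUI workflow would express the same ideas through its own nodes.

```python
# Sketch: trading speed for VRAM with half precision, CPU offload
# and a smaller image size (assumes the diffusers + accelerate libraries).
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",  # assumed model ID
    torch_dtype=torch.bfloat16,                # half precision instead of fp32
)
pipe.enable_model_cpu_offload()  # keep only the active component on the GPU

image = pipe(
    "a lighthouse on a cliff at dusk",
    height=768, width=768,       # smaller than the default to save VRAM
    num_inference_steps=28,
).images[0]
image.save("test.png")
```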

                • #15904
Zandebar
                  Participant

I've decided that I'm going to hold off upgrading the GPU until Nvidia launch the RTX 50 series and see how they perform. Given the specs it should be impressive, though only time will tell, especially with the tech issues they've had in production. I'm just worried it may turn into an Intel-type hardware issue; 12 months after release should be enough time to work that out, but then you're 12 months behind. I was kind of hoping to pick up a 4090 within my budget after the launch, but that's looking unlikely as they've reduced available stock, which is keeping the price the same. I've been looking in the Black Friday sales and prices remain unchanged, which is a bit of a surprise given that the launch of the 50 series is well known; they've handled that well to maintain the price.

                     

It's looking most likely that I'll end up with an RTX ??80 series card of some description with 16GB, either 40 or maybe 50 series; I may spring the extra cash and get the 5080 when it comes out. After the cards have been reviewed, I'll work out which way I'm going to jump.

That's why I'm really interested in the limitations and in when I'll need to use a GPU server farm; maybe that's the better way to go in the long run. Reading around, the FLUX models can completely fill up the VRAM on consumer cards, so I'm considering whether it's GPU server farms only. I do need a performance lift over my present GPU; it's a matter of working out the pros and cons.

                     

I still don't know where I'm heading with SD; I have to work that out first.

